RAYTHEON CO SUDBURY MASS EQUIPMENT DIV F/G 22/2 MILSATCOM SPACECRAFT PROCESSING STUDY. TECHNOLOGICAL LIMITATIO--ETC(U) NOV 78 A A CASTRO, J EACHUS, F HOWES, E LEWIS DCA100-78-C-0012 ER78-4370 SBIE-AD-E100 167 NL AD-A066 455 UNCLASSIFIED

DDC FILE COPY

AD A066455

# MILSATCOM SPACECRAFT PROCESSING STUDY

TASK 2 - TECHNOLOGICAL LIMITATIONS ON SATELLITE PROCESSOR APPLICATIONS

FINAL REPORT

ER78-4370

November 15, 1978

Prepared Under

Contract No. DCA 100-78-C-0012

Prepared For

DEFENSE COMMUNICATION AGENCY
Washington, D.C. 20305

Prepared By

RAYTHEON COMPANY EQUIPMENT DIVISION

Communication Systems Directorate Sudbury, Massachusetts 01776



DISTRIBUTION STATEMENT A

Approved for public release;
Distribution Unlimited

REPORT DOCUMENTATION PAGE

- 4. TITLE (and Subtitle): MILSATCOM Spacecraft Processing Study, Task 2: Technological Limitations on Satellite Processor Applications
- 5. TYPE OF REPORT & PERIOD COVERED: Final Report, May 78 - Oct 78
- 6. PERFORMING ORG. REPORT NUMBER: ER78-4370
- 7. AUTHOR(s): A. Castro, J. Eachus, F. Howes, E. Lewis, J. Stiffler
- 8. CONTRACT OR GRANT NUMBER: DCA 100-78-C-0012
- 9. PERFORMING ORGANIZATION NAME AND ADDRESS: Raytheon Company, Equipment Division, Communication Systems Directorate, Sudbury, Mass. 01776
- 10. PROGRAM ELEMENT, PROJECT, TASK AREA & WORK UNIT NUMBERS: PR-800-77-6
- 11. CONTROLLING OFFICE NAME AND ADDRESS: Defense Communication Agency, Washington, D.C. 20305
- 12. REPORT DATE: 15 November 1978
- 13. NUMBER OF PAGES: 100
- 15. SECURITY CLASS. (of this report): Unclassified
- 16. DISTRIBUTION STATEMENT (of this Report): Approved for public release; distribution unlimited.

19. KEY WORDS (Continue on reverse side if necessary and identify by block number)

Satellite Communications, on board processing, digital signal processing, digital processor architecture, frequency hopping, direct sequence, mission reliability, digital device technology.

20. ABSTRACT (Continue on reverse side if necessary and identify by block number)

This Study Report describes the technological limitations for performing on board signal processing in MILSATCOM satellites. The operational use of this hardware is for the post 1990 time frame and for the UHF, SHF and EHF bands. The study investigates the limitations imposed by frequency of use, feasible bandwidths, processing gains, chip rates, baseband processing rates and weight/volume/mission reliability.




## TABLE OF CONTENTS

| Paragraph |                                                                      | Page |
|-----------|----------------------------------------------------------------------|------|
| Foreword  |                                                                      | v    |
| Abstract  |                                                                      | vi   |
| 1.        | INTRODUCTION                                                         | 1-1  |
| 1.1       | Satellite Processor Configuration                                    | 1-2  |
| 2.        | TECHNOLOGICAL LIMITATIONS OF SPACECRAFT PROCESSOR ARCHITECTURES      | 2-1  |
| 2.1       | Architectural Trade Offs                                             | 2-2  |
| 2.1.1     | Evaluation Criteria                                                  | 2-3  |
| 2.1.2     | Fault Tolerance Goals                                                | 2-3  |
| 2.1.3     | System Loading Requirements                                          | 2-3  |
| 2.1.4     | Processor Word Length                                                | 2-4  |
| 2.1.5     | Processor Sizing                                                     | 2-4  |
| 2.1.6     | Bus Activity Data                                                    | 2-7  |
| 2.1.7     | Core Processor Description                                           | 2-7  |
| 2.1.8     | Bus Interface                                                        | 2-10 |
| 2.1.9     | Technology Assumptions                                               | 2-11 |
| 2.1.10    | Alternate Processor Architectures                                    | 2-11 |
| 2.2       | Distributed Processing System                                        | 2-12 |
| 2.2.1     | General-Purpose Processor Description                                | 2-12 |
| 2.2.2     | I/O Units                                                            | 2-16 |
| 2.2.3     | Distributed Processor Reliability                                    | 2-16 |
| 2.2.4     | Bus Activity for Distributed System                                  | 2-19 |
| 2.3       | Hierarchical Processing System                                       | 2-20 |
| 2.3.1     | Processor Descriptions for Hierarchical System                       | 2-20 |
| 2.3.2     | Hierarchical Processor Reliability                                   | 2-23 |
| 2.3.3     | Bus Activity for the Hierarchical System                             | 2-23 |
| 2.3.4     | Externally Redundant Pipeline Processing System Description          | 2-26 |
| 2.4       | Processing Units for Externally Redundant Pipeline System            | 2-26 |
| 2.4.1     | Externally Redundant Pipeline Processor Reliability                  | 2-31 |
| 2.4.2     | Bus Activity for the Externally Redundant Pipeline Processing System | 2-31 |
| 2.5       | Internally Redundant Pipeline Processing System Description          | 2-34 |
| 2.5.1     | Processing Units for Internally Redundant Pipeline System            | 2-34 |
| 2.5.2     | Internally Redundant Pipeline Processor Reliability                  | 2-37 |
| 2.6       | Summary                                                              | 2-37 |
| 3.        | HARDWARE LIMITATIONS ON SATELLITE PROCESSOR APPLICATIONS             | 3-1  |
| 3.1       | Analog Signal Processing Hardware Limitations                        | 3-1  |
| 3.1.1     | Processing of Frequency Hopped Signals                               | 3-1  |

| Paragraph  |                                                | Page |
|------------|------------------------------------------------|------|
| 3.1.2      | Processing of Direct Sequence Signals          | 3-8  |
| 3.1.3      | IF and Baseband Processing                     | 3-12 |
| 3.2        | Digital Signal Processing Hardware Limitations | 3-14 |
| 3.2.1      | State-of-the-Art Device Technologies for 1985  | 3-14 |
| 3.2.2      | Components for a Processor Satellite           | 3-41 |
| APPENDIX A | Bit Rippler Discussion                         | A-1  |

## LIST OF ILLUSTRATIONS

| Figure |                                                                                               | Page |
|--------|-----------------------------------------------------------------------------------------------|------|
| 1-1    | Satellite Processor Configuration                                                             | 1-3  |
| 2-1    | Core Processor Block Diagram                                                                  | 2-9  |
| 2-2    | Distributed Processing System                                                                 | 2-13 |
| 2-3    | General Purpose Processor for Distributed System                                              | 2-15 |
| 2-5    | Modulator Processor                                                                           | 2-17 |
| 2-6    | Hierarchical Processing System                                                                | 2-21 |
| 2-7    | General Purpose Processor for Hierarchical System                                             | 2-23 |
| 2-8    | Control Processor for Hierarchical System                                                     | 2-24 |
| 2-9    | External Redundant Pipeline Processing System                                                 | 2-27 |
| 2-10   | Externally Redundant Pipeline Processor System                                                | 2-28 |
| 2-11   | Processor for Pipeline System                                                                 | 2-30 |
| 2-12   | Bus Controller for Externally Redundant System                                                | 2-32 |
| 2-13   | Internally Redundant Pipeline Processing System                                               | 2-35 |
| 2-14   | Bus Controller for Internally Redundant Pipeline Processing System                            | 2-36 |
| 2-15   | Comparison of Processor Configurations (Optimized for P = 0.95 for 10 year Mission)           | 2-40 |
| 2-16   | Comparison of Processor Configurations (Optimized for P = 0.95 for 5 year Mission)            | 2-41 |
| 3-1    | Degradation Due to Frequency Offset and Instability, Pe/Symbol ≈ 4 × 10<sup>-2</sup>, 8-ary MFSK | 3-5  |
| 3-2    | Fine Hopping Synthesizer Block Diagram                                                        | 3-6  |
| 3-3    | Fine Hopping Synthesizer Ground and Space Versions                                            | 3-7  |
| 3-4    | Loss Due to Parabolic Phase Distortion and Finite Bandwidth                                   | 3-10 |
| 3-5    | Monolithic A/D Converter Technology                                                           | 3-13 |
| 3-6    | Growth in Chip Area                                                                           | 3-25 |
| 3-7    | Circuit Families Suitable for LSI/VLSI Implementation                                         | 3-29 |
| 3-8    | Circuit Families for Gigahertz Operation                                                      | 3-31 |
| 3-9    | Power vs Access Time                                                                          | 3-35 |
| 3-10   | LSI Memory Growth                                                                             | 3-38 |
| 3-11   | Cell/Bit Miniaturization                                                                      | 3-39 |
| A-1    | Typical Rippler Sequence                                                                      | A-2  |

# LIST OF TABLES

| Table |                                                     | Page |
|-------|-----------------------------------------------------|------|
| 2-1   | System Loading Assumptions                          | 2-5  |
| 2-2   | Processing/Memory Requirements                      | 2-6  |
| 2-3   | Bus Activity Estimates                              | 2-8  |
| 2-4   | Active Processors for Distributed System            | 2-14 |
| 2-5   | Distributed Processor Reliability                   | 2-18 |
| 2-6   | Active Processors for Hierarchical System           | 2-22 |
| 2-7   | Hierarchical Processor Reliability                  | 2-25 |
| 2-8   | Active Processors for Pipeline System               | 2-29 |
| 2-9   | Externally Redundant Pipeline Processor Reliability | 2-33 |
| 2-10  | Internally Redundant Pipeline Processor Reliability | 2-38 |
| 2-11  | LSI Requirements Summary                            | 2-39 |
| 3-1   | LSI Digital Correlator Chip Characteristics         | 3-11 |
| 3-2   | Semiconductor Technologies                          | 3-15 |
| 3-3   | "MOS" Circuit Performance vs Device Scaling         | 3-17 |
| 3-6   | Characteristics of VLSI Circuit Families            | 3-30 |
| 3-7   | Characteristics of High Speed Circuit Families      | 3-32 |
| 3-8   | Characteristics of Memory Technologies              | 3-37 |

#### FOREWORD

This Final Report presents the results of the effort on Task No. 2, "Technological Limitations on Satellite Processor Applications", of Study Contract DCA 100-78-C-0012, entitled "MILSATCOM Spacecraft Processing Study".

This study was performed for the Defense Communication Agency (DCA), Mr. Ronald P. Sherwin, Project Manager, by Raytheon Equipment Division, Communication Systems Directorate. The Contractor's activity was under the direction of A. A. Castro, Program Manager and has been conducted by the following personnel:

J. Eachus, F. Howes, E. Lewis and J. Stiffler.

#### ABSTRACT

This Final Report describes the technological limitations for performing on board signal processing in Military Communication Satellites, operational in the post 1990 time frame and in the UHF, SHF and EHF satellite communication bands. The subject of this Final Report is the second task of a Study Program to examine the basic functional requirements, technological constraints and architecture of MILSATCOM systems using satellite borne signal processors.

The analysis of the functional requirements for this Spacecraft Processor is given in Final Report ER78-4276, Task 1, "Functional Requirements for Satellite Processor". Task 2 of the Study extrapolates the limitations imposed by frequency of use, feasible bandwidths, processing gains, chip rates, baseband processing rates and reasonable weight/power/mission reliability, in the time frame of interest.

The on board signal processing configuration used to study the technological limitations consists of a Front End signal processor that interfaces with the RF subsystem, performs analog signal processing, and converts the signals to digital form before further processing in a digital Communication Processor. The limitations of both analog and digital hardware, as well as of digital processing architecture, are investigated in this task.


# SECTION 1 INTRODUCTION

The objective of Task #2 of the MILSATCOM Spacecraft Processing Study is to investigate the fundamental technological limitations on satellite borne signal processing for operational use in the post 1990 time frame. The typical signal processing requirements for this Processing Satellite were studied during Task #1 and are addressed in the Final Report ER78-4276, "Functional Requirements for Satellite Processors." The ability to fulfill these requirements will depend on limitations imposed by frequency of use, feasible bandwidths, processing gains, chip rates, baseband processing rates and reasonable weight/power limitations. The analysis and extrapolation of these limitations, for the time frame of interest, are the subject of this part of the Study.

Operational satellite processors in the post-1990 period will reflect the state-of-the-art technology of the mid-1980's. The amount of on-board digital processing possible in the mid-1980's, expressed as achievable processing gain, chip rates, baseband processing bandwidth, message processing throughput, etc., will depend upon:

- Electronic payload size, weight, and prime power availability
- Environmental conditions, including nuclear hardening
- Design life and mission reliability
- Limitations of the state-of-the-art of processor technology

For given size, weight, prime power, and mission reliability, the latter consideration will result in specific limitations on the analog and digital processing capabilities that may be implemented on board a satellite. For example, for real time digital processing it will affect:

- Maximum number of operations per unit of time (speed)
- Minimum processing cycle time (delay)
- Maximum memory size (memory)
- Maximum processed throughput (throughput)

which in turn will limit the signal processing capabilities just mentioned.

Given the rapid rate of evolution in fields such as digital signal processing and large scale integration, a realistic extrapolation of the technological limitations on Satellite Processor applications requires that the architectural and hardware aspects of Satellite Processors be treated separately. They are discussed in turn below.

#### 1.1 Satellite Processor Configuration

Figure 1-1 illustrates a possible configuration for a Satellite Processor which will be used to derive architectural and hardware limitations. The configuration is general enough to encompass all the applications investigated in Task #1; specific cases will result in variations of the parameters of this general example.

This configuration assumes a partition into a Main Processor, sometimes called the Vehicle Control Processor, and a subordinate Communication Processor. This partition may be made in the hardware or in the software of a digital processor, which may be implemented as a redundant or fault tolerant, specialized or general processor architecture. In addition, a Front End Signal Processor provides analog signal processing and the analog/digital interface between the Communication Processor and the RF subsystems (transmitters, receivers, antennas) aboard the spacecraft.

Typically, the Main Processor provides navigation, attitude, stationkeeping, antenna pointing, propulsion and thermal controls; solar panels, power and communication function activation; and, also, all executive routines, including spacecraft system autonomous operation, overall system fault tolerance, system recovery and reconfiguration. On the other hand, the Communication Processor is dedicated to the performance of the communication channel functions such as demodulation, decoding, switching and routing, formatting, mux and demux. The Front End Signal Processor performs IF processing, despreading, A/D conversion and downlink and crosslink remodulation.



Figure 1-1. Satellite Processor Configuration

# SECTION 2 TECHNOLOGICAL LIMITATIONS OF SPACECRAFT PROCESSOR ARCHITECTURES

As previously defined, the on-board Signal Processor consists of a Front End Signal Processor interfacing with the RF Subsystem, performing the analog signal processing functions and converting the signals to digital form before further processing in the Communication Processor. The real time digital processing limitations of the Communication Processor, such as:

- Maximum number of operations per unit of time (speed)
- Minimum processing cycle time (delay)
- Maximum memory size (memory)
- Maximum processed throughput (throughput)

will, by and large, limit the amount of signal processing expressed as chip rates, baseband rates, message throughput, etc., for given constraints of weight, size, mission reliability and environmental conditions, including hardening.

These limitations will result from device technology intrinsic parameters, such as speed, gate and I/O density, power consumption, hardness, etc., when considered in the architecture of the processor itself. On the other hand, device limitations influence the comparative attractiveness of the various processor architectures and the relative advantage of one technology over another depends on how the devices are to be used. Accordingly, it is necessary to identify those architectures that are most effective for communication processor applications and to evaluate each of them in conjunction with the limitations imposed by the projected device technologies.

This section addresses alternate processor architectures and their limitations for spacecraft communication signal processors. These architectures are evaluated for spacecraft relevant parameters, such as mission reliability, weight, volume, power and environmental conditions (including nuclear hardening). In order to make this evaluation, arbitrary signal processing loads and mission reliability figures have been used as parameters. The assumptions and limitations on the device technology of the processor building blocks are discussed in Section 3.

#### 2.1 Architectural Trade Offs

Processor architectures can be divided into two general categories: centralized and distributed. Centralized architectures consist of a signal processing unit interconnected through its bus system with a memory system and various input/output ports. The operating speed of central processing units has increased dramatically over the past several decades, so it is not inconceivable that a central processor could be developed that would be able to meet the projected signal processor throughput requirements.

The present trend, however, is toward the development of LSI devices capable of implementing more and more central processor functions on a single chip. This trend is, of course, highly desirable, at least in the space environment, since it greatly reduces the weight and volume associated with a given processing unit. Nevertheless, because this "processor-on-a-chip" emphasis in LSI development tends to produce devices that are more compact rather than devices that have significantly greater throughput, a distributed architecture exploiting the availability of these small, independent processing units appears, at this point, to be more promising than the centralized approach.

Distributed architectures, too, can be subdivided into two broad categories: hierarchical and peer architectures. The first term refers to those organizations in which one processor is assigned control over all other processors, assigning them tasks and directing interprocessor communication. The processors under control of the "master" processor may in turn control other processors at a lower level in the hierarchy. Usually communication takes place only between a processor and its controlling processor (at the next higher level in the hierarchy) and between the latter processor and those processors under its control (at the next lower level).

Peer structures represent the other extreme in distributed architectures. In such a structure, no processor has control over any other; generally any processor in the system can communicate with any other. In practice most distributed architectures fall somewhere between these two extremes.

#### 2.1.1 Evaluation Criteria

The evaluation approach is to select a range of processing requirements and apply them to several competing processor structures. The results will show which architecture can provide the long-term performance required while maintaining a high level of efficiency in weight, volume and power.

A measure of the efficiency of the various architectures can be obtained by comparing each alternative structure at specified levels of reliability. Two reliability levels have been selected for this evaluation:

1. 95% probability of nondegraded operation in space for five years.
2. 95% probability of nondegraded operation in space for ten years.

#### 2.1.2 Fault Tolerance Goals

The fault tolerance goals of the spacecraft communications processor are primarily constrained by the reliability goals of 95% probability of survival after five or ten years. In addition, the elimination of single-point failures has also been adopted as a system design goal. The elimination of single point failures implies the use of redundancy in each functional area of the processing system. Not only is it necessary to provide redundancy for the processing elements in the system, but also the bus interconnections must be redundant.

#### 2.1.3 System Loading Requirements

Four different sets of processing requirements were chosen as examples for use in the processor architecture evaluation. Each alternative architecture is evaluated against each of the four examples, which provides a measure of the flexibility of the system. This approach is necessary because the processing requirements of communications systems will change as requirement details become better defined. Selecting an architecture that is modular and flexible improves the usefulness of the system across a wide variety of space communication processor applications.

The four examples of processing requirements are listed in Table 2-1.

Three types of channels are postulated: uplink, crosslink and downlink. For uplink, the capacity ratio between examples 1 and 4 is 1:16. This is an adequate test of the flexibility of the proposed architectures.
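The 1:16 figure follows directly from the uplink rows of Table 2-1. The short sketch below reproduces the arithmetic; the dictionary and function names are illustrative only and do not come from the report.

```python
# Uplink loading per example, taken from Table 2-1: (links, channels per link).
UPLINK = {1: (5, 7), 2: (10, 7), 3: (5, 56), 4: (10, 56)}


def uplink_channels(example: int) -> int:
    """Total number of uplink channels for one loading example."""
    links, channels_per_link = UPLINK[example]
    return links * channels_per_link


for example in sorted(UPLINK):
    print(f"Example {example}: {uplink_channels(example)} uplink channels")

# Capacity ratio between the smallest and the largest example.
ratio = uplink_channels(4) // uplink_channels(1)
print(f"Example 1 to Example 4 = 1:{ratio}")
```

The four totals (35, 70, 280 and 560 channels) also scale the interleaver and encoder/decoder rows of Table 2-2.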

#### 2.1.4 Processor Word Length

Each of the processor alternatives is composed of one or more general-purpose processors connected by a busing structure. The length of the data word for these processors has been studied in terms of the effect of quantization noise. The results of this study indicate that an 8-bit data word is adequate for all identified tasks of the processing system. If additional processing tasks indicate that a 12- or 16-bit data word is required, then corresponding modifications will be made in the individual processor designs.
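As a rough point of reference for the word lengths mentioned, the sketch below evaluates the standard textbook approximation for the signal-to-quantization-noise ratio of an ideal uniform quantizer driven by a full-scale sine wave (about 6.02 dB per bit plus 1.76 dB). This is not the analysis performed in the study, whose method is not described here; the function name is illustrative.

```python
import math


def sqnr_db(bits: int) -> float:
    """Ideal SQNR in dB for a full-scale sine wave and a uniform quantizer.

    Textbook approximation: SQNR = 20*log10(2)*bits + 10*log10(1.5).
    """
    return 20 * math.log10(2) * bits + 10 * math.log10(1.5)


for word_length in (8, 12, 16):
    print(f"{word_length:2d}-bit word: about {sqnr_db(word_length):.1f} dB")
```

Each additional bit buys roughly 6 dB, which is why the step from 8 to 12 or 16 bits is only justified if a task actually needs the extra dynamic range.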

#### 2.1.5 Processor Sizing

Functionally, the processing requirements can be partitioned into the following classes:

- Demodulator
- Interleaver/Deinterleaver
- Encoder/Decoder

The demodulation function requires the greatest amount of throughput. In order to improve the throughput of the demodulator, each processor implementing a demodulating function will consist of two core processors, each capable of independent processing. The memory requirement for the demodulator is 512 9-bit words.

The interleaver/deinterleaver function does not have a significant throughput requirement, but it does require a 64K, 24-bit memory.

The processing and the memory requirements of the encoder/decoder function do not constrain the design of the processor.

The uplink data on which the sizing is based are shown in Table 2-2. Processing speeds are given in terms of the number of equivalent additions plus the number of multiplications required per second. In calculating the number of processors required, the throughput of each processor is projected to be 6.67 mega-operations per second.

Table 2-1. System Loading Assumptions

| Channel type | Signal characteristics |
|--------------|------------------------|
| UPLINK       | FDMA, 8-ary FSK, 4 chip diversity, 75 bps, coded rate 1/2 dual 3, block interleaved, synchronized to satellite timing, 5 ms symbol or chip |
| CROSSLINK    | QPSK, full duplex |
| DOWNLINK     | TDM |

|           | Example 1                | Example 2                 | Example 3                 | Example 4                  |
|-----------|--------------------------|---------------------------|---------------------------|----------------------------|
| UPLINK    | 5 links, 7 channels each | 10 links, 7 channels each | 5 links, 56 channels each | 10 links, 56 channels each |
| CROSSLINK | 1 link, 2400 bps         | 1 link, 4800 bps          | 1 link, 9600 bps          | 1 link, 9600 bps           |
| DOWNLINK  | 1 link, 2400 bps         | 1 link, 4800 bps          | 2 links, 9600 bps         | 4 links, 9600 bps          |

Table 2-2. Processing/Memory Requirements

| Uplink                                                       | Example 1         | Example 2         | Example 3         | Example 4         |
|--------------------------------------------------------------|-------------------|-------------------|-------------------|-------------------|
| Demodulator (dual; each half processor processing N KOPS/s\*) |                   |                   |                   |                   |
| Processing (A+M/sec)                                         | 1,883 KOPS/s      | 3,778 KOPS/s      | 21,680 KOPS/s     | 43,360 KOPS/s     |
| Proc. Req'd.                                                 | 1                 | 1                 | 4                 | 7                 |
| Mem/proc. Req'd.                                             | 320 words, 9 bits | 320 words, 9 bits | 366 words, 9 bits | 366 words, 9 bits |
| Memory/proc.                                                 | 512 x 9           | 512 x 9           | 512 x 9           | 512 x 9           |
| Interleaver/Deinterleaver                                    |                   |                   |                   |                   |
| Bytes                                                        | 5,250             | 10,500            | 42,000            | 84,000            |
| Memory Req'd. (1K x 28b/ch)                                  | 35K               | 70K               | 280K              | 560K              |
| Proc. Req'd. (64K x 28b/proc)                                | 1                 | 2                 | 5                 | 9                 |
| Encoder/Decoder                                              |                   |                   |                   |                   |
| Bits/s                                                       | 2,625             | 5,250             | 21,000            | 42,000            |
| Dual 3 Viterbi (proc = 10,500 b/s), Proc. Req'd.             | 1                 | 1                 | 2                 | 4                 |

\*The number of operations is the sum of the equivalent additions plus the multiplications.
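The processor counts in Table 2-2 appear to follow from dividing each load by the capacity of one processor and rounding up. The sketch below applies that rule to the demodulator loads and to the interleaver memory (1K words per channel, 64K words per processor). The rounding rule and all names are assumptions made for illustration; the report does not spell out its sizing procedure.

```python
import math

PROCESSOR_KOPS = 6670          # projected throughput: 6.67 mega-operations/s
WORDS_PER_CHANNEL_K = 1        # interleaver memory per channel, in K words
WORDS_PER_PROCESSOR_K = 64     # external RAM per processor, in K words

# Demodulator loads from Table 2-2, in KOPS/s, and uplink channel counts.
DEMOD_LOAD_KOPS = {1: 1883, 2: 3778, 3: 21680, 4: 43360}
UPLINK_CHANNELS = {1: 35, 2: 70, 3: 280, 4: 560}


def units_required(load: float, capacity: float) -> int:
    """Smallest whole number of units whose combined capacity covers the load."""
    return math.ceil(load / capacity)


demod = [units_required(DEMOD_LOAD_KOPS[e], PROCESSOR_KOPS) for e in (1, 2, 3, 4)]
interleaver = [
    units_required(UPLINK_CHANNELS[e] * WORDS_PER_CHANNEL_K, WORDS_PER_PROCESSOR_K)
    for e in (1, 2, 3, 4)
]
print("Demodulator processors:", demod)
print("Interleaver processors:", interleaver)
```

Under this reading the demodulator row comes out as 1, 1, 4 and 7 processors, matching the table.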

#### 2.1.6 Bus Activity Data

In this study, processors are allocated one of the following functions:

- A/D conversion
- Demodulator processing
- Interleaver/deinterleaver processing
- Encoder/decoder processing
- Modulator processing

In order to determine the bus usage for the alternate systems under consideration, estimates have been made of the bus activity among the processing units. Table 2-3 lists the estimated bus activity for the flow of data through the system. This data will be used to identify potential bus bottlenecks for the alternate systems.
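One simple way to use the Table 2-3 estimates is to total the transfers for each example, which gives the load a single shared bus would have to carry if every flow crossed it. That single-bus case is an assumption chosen here as a worst case, not a configuration stated in the report; the names are illustrative.

```python
# Byte-transfer rates from Table 2-3, in KHz, for Examples 1 through 4.
BUS_ACTIVITY_KHZ = {
    "A/D to Demodulator": (133, 266, 1062, 2125),
    "Demodulator to Interleaver/Deinterleaver": (50, 100, 398, 797),
    "Interleaver/Deinterleaver to Encoder/Decoder": (10, 20, 80, 160),
    "Encoder/Decoder to Modulator": (5, 10, 40, 80),
}


def total_transfers_khz(example: int) -> int:
    """Sum of all inter-unit transfer rates for one example (1 to 4)."""
    return sum(rates[example - 1] for rates in BUS_ACTIVITY_KHZ.values())


def dominant_flow(example: int) -> str:
    """Name of the flow that contributes the most traffic."""
    return max(BUS_ACTIVITY_KHZ, key=lambda name: BUS_ACTIVITY_KHZ[name][example - 1])


for example in (1, 2, 3, 4):
    print(f"Example {example}: {total_transfers_khz(example)} KHz total, "
          f"dominated by {dominant_flow(example)}")
```

In every example the A/D to demodulator flow carries about two thirds of the traffic, so that link is the first place to look for a bottleneck.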

#### 2.1.7 Core Processor Description

The Core Processor is the basic building block for the processors used during this study. A block diagram of this processor is shown in Figure 2-1.

The Core Processor is an 8-bit processor with dual 8 x 8 multipliers and an 8-bit ALU. The processor has internal storage capacity of 512 9-bit words (8 bits plus overflow). The address of the register array is controlled by the Address Generator which can either sequence the addresses by a predetermined algorithm or can allow direct access to the array.

Control over the processor is maintained by a 256 word, 64 bit RAM. This control RAM normally operates as a control ROM with a single instruction which uses the 256 words for microprogram control. The primary difference between this control RAM and a normal processor's control ROM is that the control RAM can be loaded externally from the Bus Interface port. This allows each identical core processor in the system to be assigned an individual task by the supervisor or Main Processor. As a result, each processor may execute a different algorithm but still be spared by the same type of processor.

Table 2-3. Bus Activity Estimates (in Byte Transfers)

| Processor Units                              | Example 1 | Example 2 | Example 3 | Example 4 |
|----------------------------------------------|-----------|-----------|-----------|-----------|
| A/D to Demodulator                           | 133 KHz   | 266 KHz   | 1,062 KHz | 2,125 KHz |
| Demodulator to Interleaver/Deinterleaver     | 50 KHz    | 100 KHz   | 398 KHz   | 797 KHz   |
| Interleaver/Deinterleaver to Encoder/Decoder | 10 KHz    | 20 KHz    | 80 KHz    | 160 KHz   |
| Encoder/Decoder to Modulator                 | 5 KHz     | 10 KHz    | 40 KHz    | 80 KHz    |

Figure 2-1. Core Processor Block Diagram

The External RAM control provides address and data control over an external 64K word 24-bit RAM. A parity bit is generated or checked on each memory access. Control is also maintained over the state of the Bit Rippler chip which allows access to three spare bits in the external 64K word RAM.
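One plausible reading of this description is a 28-bit physical word: 24 data bits, one parity bit and three spare bits, which would match the 28-bit width quoted for the interleaver memory in Table 2-2. The sketch below packs and checks such a word. The bit positions, the choice of even parity and the function names are all assumptions for illustration; the report does not specify them.

```python
DATA_BITS = 24        # data width of the external RAM word
PARITY_POS = 24       # assumed position of the parity bit
SPARE_BITS = 3        # spare bits reachable through the Bit Rippler
WORD_BITS = DATA_BITS + 1 + SPARE_BITS   # 28 bits in total

DATA_MASK = (1 << DATA_BITS) - 1


def parity(value: int) -> int:
    """Return 1 if value has an odd number of 1 bits, else 0."""
    return bin(value).count("1") & 1


def pack_word(data: int) -> int:
    """Attach an even-parity bit to a 24-bit data value (spare bits left at 0)."""
    if not 0 <= data <= DATA_MASK:
        raise ValueError("data must fit in 24 bits")
    return data | (parity(data) << PARITY_POS)


def check_word(word: int) -> bool:
    """True if the stored parity bit agrees with the data bits."""
    return parity(word & DATA_MASK) == (word >> PARITY_POS) & 1


stored = pack_word(0xABCDEF)
print(check_word(stored))          # a clean word passes
print(check_word(stored ^ 0x10))   # a single flipped data bit is detected
```

A single parity bit detects any odd number of bit errors in the word but cannot locate them, which is presumably why repair relies on the spare bits rather than on the parity code.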

#### 2.1.8 Bus Interface

The Bus Interfaces among processors are important in maintaining the reliability and fault tolerance goals of the system. Each bus is composed of an 8-bit address byte, an 8-bit data byte, a parity bit, a spare byte and the control lines necessary to transfer data over these buses. The redundancy for the address, data and parity is provided by the spare byte. Sparing is accomplished by byte rippling. The Bus Controller or Main Processor identifies to all processors on the bus which byte is to be spared. All control and status lines are triplicated and voted on by the receiver.
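The two protection mechanisms named here can be sketched in a few lines. The voter is an ordinary bitwise two-out-of-three majority. The rippling function assumes that "rippling" means shifting every logical byte at or beyond the failed lane one position toward the spare; that interpretation, the lane ordering and all names are assumptions, since the mechanics are not detailed in this section.

```python
LOGICAL_BYTES = ("address", "data", "parity")   # three working bytes
PHYSICAL_LANES = 4                              # three working lanes plus one spare


def vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three copies of a control or status value."""
    return (a & b) | (a & c) | (b & c)


def ripple_map(failed_lane=None) -> dict:
    """Map each logical byte to a physical lane, skipping one failed lane.

    Bytes at or beyond the failed lane move up by one lane, so the last
    byte ends up on the spare lane (index 3).
    """
    mapping = {}
    for logical, name in enumerate(LOGICAL_BYTES):
        shifted = failed_lane is not None and logical >= failed_lane
        mapping[name] = logical + 1 if shifted else logical
    return mapping


print(bin(vote(0b1011, 0b1011, 0b0000)))   # one bad copy is outvoted
print(ripple_map())                        # no failure: lanes 0, 1, 2
print(ripple_map(failed_lane=1))           # data and parity move to lanes 2, 3
```

With a single spare, the mapping can absorb exactly one failed lane per bus, which is the "three of four bytes" condition used in the reliability estimate.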

Each processor interfaces to two buses via the Bus Interface LSI device. The Bus Interface chip provides the byte isolation, rippling and bus drive for both buses. It also provides fault-tolerant control over the buses, maintaining isolation between logic that could otherwise cause an entire bus to fail because of correlated failure mechanisms.

The Bus Interface provides capability to interface to either one or two
Core Processor chips. Data transfers can be made between the Core Processor's
Register Array or the Control RAM and the Bus Interface chip.


The hazard rate of a bus is the sum of the bus hazard rates for each unit on the bus. Half the hazard rate for each Bus Interface chip is allocated to the bus. Since there are two buses connected to each Bus Interface chip and four bytes per interface, one sixteenth the hazard rate of one chip is allocated to a byte. The chip hazard rate has been projected to be  $2 \times 10^{-7}$  failures per hour. The hazard rate for one byte will be  $0.125 \times 10^{-7}$  per hour. Three bytes of the four bytes in each bus must survive for all the units on the bus in order to have an operable bus.

In those cases where the reliability goal cannot be achieved with a single spare byte bus system, a second spare byte is added using the same hazard rate of  $0.125 \times 10^{-7}$  per hour. This will add an equivalent number of LSI chips to the system equal to the number of processors on the bus divided by eight.
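The allocation arithmetic and the three-of-four byte requirement can be sketched as follows. The exponential survival law, the independence of byte lanes and the function name `bus_survival` are assumptions of this sketch; the report states only the hazard rates and the survival rule.

```python
import math
from math import comb

CHIP_HAZARD = 2e-7                      # projected Bus Interface chip rate, per hour
# Half the chip is allocated to the buses, split over two buses and four bytes.
BYTE_HAZARD = CHIP_HAZARD / 2 / 2 / 4   # 0.125e-7 per hour per byte per unit


def bus_survival(units, hours, bytes_total=4, bytes_needed=3):
    """Probability that at least bytes_needed byte lanes of a bus survive.

    Assumes each lane fails independently at a constant rate equal to the
    sum of the per-unit byte hazard rates of all units on the bus.
    """
    lane_rate = units * BYTE_HAZARD
    p = math.exp(-lane_rate * hours)    # survival probability of one lane
    return sum(comb(bytes_total, k) * p**k * (1 - p)**(bytes_total - k)
               for k in range(bytes_needed, bytes_total + 1))


five_years = 5 * 8760
print(bus_survival(24, five_years))                  # one spare byte
print(bus_survival(24, five_years, bytes_total=5))   # second spare byte added
```

The 24-unit bus used here is only an example load; any unit count can be substituted.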

#### 2.1.9 Technology Assumptions

The processor is projected to be deployed by 1990. The LSI technology on which the system will be based is the 1983-1985 technology discussed in Section 3.2 of this Report.

For this study these projected LSI technology assumptions are as listed below:

- 10,000 to 20,000 gates per LSI device
- 100 I/O pads per LSI device
- 128K to 256K by one bit RAM size
- Capability to electrically isolate circuits on the same LSI device
- Each LSI device will have a hazard rate of $2 \times 10^{-7}$ per hour

These technology projections have not all been fully utilized for this processor study. The only characteristic which has been fully applied is the 100 I/O pads per device. This study utilizes 64K words by 1 bit RAMs. The hazard rate assumed for the LSI devices is considered reasonable for the complexity of the devices and the number of I/O pads. This figure may, in fact, be conservative, as the technology requirements of the processing system do not fully use the projected capabilities of the technology.

#### 2.1.10 Alternate Processor Architectures

Four alternate processor configurations are the basis of the evaluation:

1. Distributed Processing System
2. Hierarchical Processing System
3. Externally Redundant Pipeline Processing System
4. Internally Redundant Pipeline Processing System

Each of these configurations is described in the following sections.

#### 2.2 Distributed Processing System

The Distributed Processing system, shown in Figure 2-2, uses a single design for all processing functions. This signal processing unit, together with an Analog-to-Digital conversion unit and a Modulator unit, make up the basic building blocks of the system.

The processors are interconnected by three buses: the first connects the Main Processor to the signal processors; the second is an interprocessor bus which interconnects all signal processors; and the third is an I/O bus which connects all processing units with the A/D and Modulator units.

Table 2-4 lists the active processing units required by the distributed system to implement the system loading examples 1 to 4 (cf. Table 2-1).

The A/Ds and Modulators are dedicated to communication links and are, therefore, individually spared. A spares pool can be used for the remaining processors.

#### 2.2.1 General-Purpose Processor Description

The general-purpose processing units are all identical. Any processor can be assigned any task including the supervisory task. The Main Processor will assign the tasks to the individual processing units. A block diagram for the processing unit is shown in Figure 2-3. Since this processor must perform a variety of tasks, it is more complex than if special-purpose processors were used.

This processing unit consists of two Core Processors operating in parallel, a 64K word, 28-bit RAM, and interfaces to the three buses. Each Core Processor is a single LSI device (the Core Processor is described in Section 2.1.7). Each of the two interfaces is also an LSI device.

The Bit Ripple LSI device is used to reduce the failure rate of the processor. The functional memory requirement is 64K words of 24 bits. With the

Figure 2-2. Distributed Processing System

Table 2-4. Active Processors for Distributed System

| Example                                                               | 1<br>Active | 2<br>Active | 3<br>Active | 4<br>Active |
|-----------------------------------------------------------------------|-------------|-------------|-------------|-------------|
| A/D                                                                   |             |             |             |             |
| Uplink                                                                | 5           | 10          | 5           | 10          |
| Crosslink                                                             | 1           | 1           | 2           | 4           |
| Demodulator                                                           |             |             |             |             |
| Uplink                                                                | 1           | 1           | 4           | 7           |
| Crosslink                                                             | 1           | 1           | 1           | 1           |
| Interleaver/<br>Deinterleaver                                         |             |             |             |             |
| Uplink (De)                                                           | 1           | 2           | 5           | 9           |
| Crosslink (De)                                                        | 0           | 0           | 0           | 0           |
| Downlink                                                              | 1           | 1           | 1           | 1           |
| Encoder/Decoder                                                       |             |             |             |             |
| Uplink (De), Crosslink (De), Downlink/Crosslink (En) (combined)       | 2           | 2           | 2           | 3           |
| Modulator                                                             |             |             |             |             |
| Crosslink                                                             | 1           | 1           | 2           | 4           |
| Downlink                                                              | 1           | 1           | 2           | 4           |
| Supervisor Processor                                                  | 1           | 1           | 1           | 1           |
| Total Active A/Ds                                                     | 6           | 11          | 7           | 14          |
| Total Active G. P. Processors                                         | 7           | 8           | 13          | 22          |
| Total Active Modulators                                               | 2           | 2           | 4           | 8           |



Figure 2-3. General Purpose Processor for Distributed System

addition of a parity bit and three spare bits, memory failures can be detected by the Core Processor and the faulty bit can be "rippled" out by the Bit Rippler. This technique significantly reduces the effective hazard rate of the processor, because the memory's contribution to the processor hazard rate is replaced by the Bit Ripple's hazard rate. Thus, the total failure rate is reduced from $66 \times 10^{-7}$ to $10 \times 10^{-7}$ per hour. (The actual hazard rate used for the processor is $9 \times 10^{-7}$ per hour, as one-half of the Bus Interface chip hazard rate is allocated to the Interprocessor and I/O buses.) An explanation of the operation of the Bit Ripple is given in Appendix A.
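The three hazard-rate figures quoted for the general-purpose processor can be recovered from a simple device count. The sketch below assumes that every device, including each of the 28 64K by 1 bit RAMs, carries the projected $2 \times 10^{-7}$ per hour rate and that the unprotected total counts all 33 devices; this accounting is inferred from the quoted totals, not stated in the report.

```python
LSI_HAZARD = 2.0  # in units of 1e-7 failures per hour per device (Section 2.1.9)

# Device count for one distributed-system general-purpose processor.
chips = {
    "core_processor": 2,
    "bus_interface": 2,
    "bit_ripple": 1,
    "ram_64k_x1": 28,   # 24 data bits + 1 parity bit + 3 spare bits
}

# All devices counted at the full rate (assumed basis of the 66 figure).
all_devices = LSI_HAZARD * sum(chips.values())

# With bit rippling, the memory contribution drops out of the effective rate.
logic_only = all_devices - LSI_HAZARD * chips["ram_64k_x1"]

# Half of one Bus Interface chip is charged to the buses instead.
processor_rate = logic_only - 0.5 * LSI_HAZARD

print(all_devices, logic_only, processor_rate)
```

The same bookkeeping, applied to one Bus Interface chip plus one A/D converter or two modulator chips, is consistent with the I/O unit rates quoted in Section 2.2.2.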

#### 2.2.2 I/O Units

The I/O Units of the Distributed Processing system are the A/D and Modulator modules. These units are dedicated to receivers or transmitters and cannot utilize pooled spares. Each I/O unit requires one (or more) dedicated spares.

The A/D conversion unit (Figure 2-4) converts analog data to 8-bit digital data. Each A/D unit is composed of one Bus Interface Chip and one A/D converter.

The Modulator processor (Figure 2-5) converts digital baseband data to modulated analog data for transmission. The Modulators can be dedicated units, each designed for a different type of modulation, or can be designed as universal units that can be programmed to adapt to a variety of modulation techniques. For this study a programmable modulator is proposed which is of the complexity of two LSI chips. These, together with the Bus Interface Chip, comprise the Modulator Processor.

The hazard rate for the A/D unit is  $3 \times 10^{-7}$  per hour. The hazard rate for the Modulator unit is  $5 \times 10^{-7}$  per hour.

#### 2.2.3 Distributed Processor Reliability

A goal for this study is the development of a processing system which will have a 0.95 probability of nondegraded operation in space after 5 or 10 years. A summary of the results for the distributed processing system is provided in Table 2-5.



Figure 2-4. A/D Converter Unit



Figure 2-5. Modulator Processor

Table 2-5. Distributed Processor Reliability

|                                                               | A/D's Active | A/D's Spare | General-Purpose Processors Active | General-Purpose Processors Spare | Modulators Active | Modulators Spare | P<sub>S</sub> |
|---------------------------------------------------------------|--------------|-------------|-----------------------------------|----------------------------------|-------------------|------------------|---------------|
| Optimized for 5 year mission                                  |              |             |                                   |                                  |                   |                  |               |
| Example 1                                                     | 6            | 6           | 7                                 | 1                                | 2                 | 2                | .9610         |
| Example 2                                                     | 11           | 11          | 8                                 | 2                                | 2                 | 2                | .9889         |
| Example 3                                                     | 7            | 7           | 13                                | 2                                | 4                 | 4                | .9750         |
| Example 4                                                     | 14           | 14          | 22                                | 3                                | 8                 | 8                | .9750         |
| Optimized for 10 year mission (with 2 spare bytes on I/O bus) |              |             |                                   |                                  |                   |                  |               |
| Example 1                                                     | 6            | 6           | 7                                 | 2                                | 2                 | 2                | .9652         |
| Example 2                                                     | 11           | 11          | 8                                 | 3                                | 2                 | 2                | .9525         |
| Example 3                                                     | 7            | 7           | 13                                | 3                                | 4                 | 4                | .9554         |
| Example 4                                                     | 14           | 14          | 22                                | 6                                | 8                 | 8                | .9598         |

The four examples listed are the four system loading examples defined in Table 2-1. The number of active processors required is listed in Table 2-4. Table 2-5 gives the probability of success (P<sub>S</sub>) for the distributed processing system for the given combination of active A/D conversion units, general-purpose processors and modulators. The reliability of the three-active, one-spare byte buses has been included in the calculations for the 5 year mission. For each example, the system was optimized to provide the smallest number of spares consistent with the reliability goals.
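A calculation of this kind can be sketched as below for the Example 1, 5 year entry of Table 2-5. The report does not state its reliability model, so this sketch assumes independent exponential failures, powered spares that fail at the same rate as active units, perfect switchover, and no bus contribution; the function names are illustrative.

```python
import math
from math import comb


def survive(rate, hours):
    """Survival probability of one unit under an assumed exponential law."""
    return math.exp(-rate * hours)


def dedicated(active, rate, hours):
    """Each active unit has its own spare; a pair is lost only if both fail."""
    q = 1 - survive(rate, hours)
    return (1 - q * q) ** active


def pooled(active, spares, rate, hours):
    """At least `active` of the active + spare units must survive."""
    n = active + spares
    p = survive(rate, hours)
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(active, n + 1))


hours = 5 * 8760
ps = (dedicated(6, 3e-7, hours)        # A/D units, one dedicated spare each
      * pooled(7, 1, 9e-7, hours)      # general-purpose processors, pooled spare
      * dedicated(2, 5e-7, hours))     # modulators, one dedicated spare each
print(round(ps, 4))
```

Under these assumptions the product comes out at roughly 0.96; including the bus terms would lower it slightly.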

For the 10 year mission, the Example 4 case could not achieve the reliability goal of 0.95 probability of mission success with a single spare byte on all the buses. Therefore, for the 10 year mission, a second spare byte was provided on the I/O bus which has the greatest number of units connected to it.

#### 2.2.4 Bus Activity for Distributed System

Table 2-3 lists estimates for the bus activity for the various processing functions. For the distributed system, the effect of bus activity is significant only for the I/O bus which interconnects the A/D's and the Modulators with the general-purpose processors. The I/O bus must, therefore, process data at the sum of the A/D to demodulator rates plus the encoder/decoder to modulator rates. This results in the following bus activity requirements for the I/O bus:

| Example 1 | Example 2 | Example 3 | Example 4 |
|-----------|-----------|-----------|-----------|
| 7.1 µs    | 3.6 µs    | 0.91 µs   | 0.45 µs   |

Example 4 requires a word transfer every 0.45  $\mu$ s. This will require a careful bus design because the example 4 (10 year) system has 82 units on the I/O bus (cf. Table 2-5).
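The I/O bus figures follow from the Table 2-3 rates by summing the two rates that share the bus and taking the reciprocal. The sketch below reproduces that arithmetic; the list and function names are illustrative. For Examples 2 to 4 the results agree with the tabulated intervals to within rounding, while Example 1 comes out slightly higher (about 7.2 µs).

```python
# Table 2-3 byte-transfer rates in kHz for Examples 1 to 4.
AD_TO_DEMOD = [133, 266, 1062, 2125]
ENCDEC_TO_MOD = [5, 10, 40, 80]


def transfer_interval_us(*rates_khz):
    """Microseconds available per byte transfer at the summed rate."""
    return 1000.0 / sum(rates_khz)


io_bus = [transfer_interval_us(a, e) for a, e in zip(AD_TO_DEMOD, ENCDEC_TO_MOD)]
for example, interval in enumerate(io_bus, start=1):
    print(f"Example {example}: {interval:.2f} us per transfer")
```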

One technique which can reduce the time constraint on the bus is to have the address on the bus relate to the next transfer's data. Although this complicates the system somewhat, it may be the only way to achieve the throughput with the distributed system.

#### 2.3 Hierarchical Processing System

The hierarchical processing system is shown in Figure 2-6. This system differs from the distributed system in the following ways:

| Hierarchical System         | Distributed System                    |
|-----------------------------|---------------------------------------|
| Dedicated Control Processor | Any G. P. processor can be supervisor |
| Single 16-bit bus           | Three 8-bit buses                     |

The Control Processor, the A/Ds and the Modulators all have dedicated spares while the general-purpose processors utilize pooled spares. The Control Processor can be specialized to its required function, that of interfacing to the Main Processor and controlling the A/Ds, general-purpose processors and the modulators.

Sixteen bits are required for both address and data buses. A spare byte is provided for the combination address-data bus.

Table 2-6 is a list of the active processing units required by the hierarchical system to implement the loading examples 1-4 (cf. Table 2-1).

#### 2.3.1 Processor Descriptions for Hierarchical System

The hierarchical system general-purpose processor is shown in Figure 2-7. The Bus Interface provides the interface to the 16-bit bus. This processor consists of a Bus Interface, two Core Processors, a Bit Ripple and 28 64K by 1 bit RAMs. The hazard rate for this processor is  $7 \times 10^{-7}$  per hour.

The A/D and Modulator units are as shown in Figures 2-4 and 2-5 with the exception that the Bus Interface provides the interface with a single 16-bit bus rather than two 8-bit buses. The hazard rate for the A/D unit is  $3 \times 10^{-7}$  per hour. The hazard rate for the Modulator unit is  $5 \times 10^{-7}$  per hour.

The Control Processor is shown in Figure 2-8. It differs from the general-purpose processor in that it has a Main Memory Interface and only one Core Processor. The hazard rate for this processor is $7 \times 10^{-7}$ per hour.



Figure 2-6. Hierarchical Processing System

Table 2-6. Active Processors for Hierarchical System

| Example                       | 1<br>Active | 2<br>Active | 3<br>Active | 4<br>Active |
|-------------------------------|-------------|-------------|-------------|-------------|
| A/D                           |             |             |             |             |
| Uplink                        | 5           | 10          | 5           | 10          |
| Crosslink                     | 1           | 1           | 2           | 4           |
| Demodulator                   | 1           | 1           | 4           | 7           |
| Uplink                        | 1           | 1           | 1           | 1           |
| Crosslink                     | 1           | 1           | 1           | 1           |
| Interleaver/<br>Deinterleaver |             |             |             |             |
| Uplink (De)                   | 1           | 2           | 5           | 9           |
| Crosslink (De)                | 0           | 0           | 0           | 0           |
| Downlink                      | 1           | 1           | 1           | 1           |
| Encoder/Decoder               |             |             |             |             |
| Uplink (De)                   | 1           | 1           |             |             |
| Crosslink (De)                |             |             | 2           | 3           |
| Downlink/Crosslink            | 1           | 1           |             |             |
| Modulator                     |             |             |             |             |
| Crosslink                     | 1           | 1           | 2           | 4           |
| Downlink                      | 1           | 1           | 2           | 4           |
| Control Processor             | 1           | 1           | 1           | 1           |
| Total Active A/D              | 6           | 11          | 7           | 14          |
| Total Active G. P. Processors | 6           | 7           | 13          | 21          |
| Total Active Modulators       | 2           | 2           | 4           | 8           |
| Total Control Processors      | 1           | 1           | 1           | 1           |



Figure 2-7. General Purpose Processor for Hierarchical System

Each bus byte is allocated a hazard rate of $0.125 \times 10^{-7}$ per hour per processor. Five of the seven bytes in each bus must survive for all units on the bus in order to have an operable bus.

#### 2.3.2 Hierarchical Processor Reliability

The reliability summary for the hierarchical system is shown in Table 2-7 for the four system loading examples. The number of active processors required is listed in Table 2-6. In order to achieve the reliability goals for examples 2, 3 and 4, all examples are assumed to have two spare bytes (with a hazard rate of $0.125 \times 10^{-7}$ per hour per byte per processor).

#### 2.3.3 Bus Activity for the Hierarchical System

Table 2-3 lists estimates for the bus activity for the processing functions. Since the hierarchical system data bus is 16 bits wide, the number of transfers required will be one-half the sum of the column in Table 2-3. This results in the following bus activity requirements for the bus.



Figure 2-8. Control Processor for Hierarchical System

Table 2-7. Hierarchical Processor Reliability

|                               | A/D's Active | A/D's Spare | General-Purpose Processors Active | General-Purpose Processors Spare | Modulators Active | Modulators Spare | P<sub>S</sub> |
|-------------------------------|--------------|-------------|-----------------------------------|----------------------------------|-------------------|------------------|---------------|
| Optimized for 5-year mission  |              |             |                                   |                                  |                   |                  |               |
| Example 1                     | 6            | 6           | 6                                 | 1                                | 2                 | 2                | .9808         |
| Example 2                     | 11           | 11          | 7                                 | 1                                | 2                 | 2                | .9738         |
| Example 3                     | 7            | 7           | 13                                | 2                                | 4                 | 4                | .9863         |
| Example 4                     | 14           | 14          | 21                                | 2                                | 8                 | 8                | .9607         |
| Optimized for 10-year mission |              |             |                                   |                                  |                   |                  |               |
| Example 1                     | 6            | 6           | 6                                 | 2                                | 2                 | 2                | .9781         |
| Example 2                     | 11           | 11          | 7                                 | 2                                | 2                 | 2                | .9708         |
| Example 3                     | 7            | 7           | 13                                | 3                                | 4                 | 4                | .9699         |
| Example 4                     | 14           | 14          | 21                                | 5                                | 8                 | 8                | .9583         |

| Example 1 | Example 2 | Example 3 | Example 4 |
|-----------|-----------|-----------|-----------|
| 10 µs     | 4.1 µs    | 1.3 µs    | 0.63 µs   |

#### 2.3.4 Externally Redundant Pipeline Processing System Description

This system (Figures 2-9 and 2-10) differs from the previously addressed system alternatives in the following ways:

- Sparing is accomplished external to the system of processors.
- Subprocessors are adapted to function: there are three types of processors vs. one for the previously discussed systems.
- There are four distinct buses: the Bus Controller connects to all buses; other processors connect to one or two buses.

Data flow in the processors is from the A/D conversion units to the Demodulators, to the Interleavers/Deinterleavers, to the Encoders/Decoders, and then to the Modulators (from left to right in Figure 2-10). The four buses serve to alleviate bus congestion and reduce the loading on each bus.

Table 2-8 lists the active processing units required by this system to implement the system loading examples 1 to 4 (cf. Table 2-1).

#### 2.4 Processing Units for Externally Redundant Pipeline System

The three processors used within the pipeline system are:

- Demodulator
- Interleaver/Deinterleaver
- Encoder/Decoder

These three match with the processor requirements listed in Table 2-8. Figure 2-11 is a block diagram of each of the three processors. Note that each uses the standard LSI devices: the Core Processor and the Bus Interface.

Since there is no redundancy within the processor system, the effective hazard rate has been slightly reduced for the processors. The hazard rates are listed below:



Figure 2-9. External Redundant Pipeline Processing System



Figure 2-10. Externally Redundant Pipeline Processor System

Table 2-8. Active Processors for Pipeline System

| Example                       | 1<br>Active | 2<br>Active | 3<br>Active | 4<br>Active |
|-------------------------------|-------------|-------------|-------------|-------------|
| A/D                           |             |             |             |             |
| Uplink                        | 5           | 10          | 5           | 10          |
| Crosslink                     | 1           | 1           | 2           | 4           |
| Demodulator                   |             |             |             |             |
| Uplink                        | 1           | 1           | 4           | 7           |
| Crosslink                     | 1           | 1           | 1           | 1           |
| Interleaver/<br>Deinterleaver |             |             |             |             |
| Uplink (De)                   | 1           | 2           | 5           | 9           |
| Crosslink (De)                | 0           | 0           | 0           | 0           |
| Downlink                      | 1           | 1           | 1           | 1           |
| Encoder/Decoder               |             |             |             |             |
| Uplink (De)                   | 1           | 1           | 2           | 4           |
| Crosslink (De)                | 1           | 1           | 1           | 1           |
| Downlink/Crosslink            | 1           | 1           | 1           | 1           |
| Modulator                     |             |             |             |             |
| Crosslink                     | 1           | 1           | 2           | 4           |
| Downlink                      | 1           | 1           | 2           | 4           |
| Bus Controller                | 1           | 1           | 1           | 1           |





Figure 2-11. Processor for Pipeline System (panels: Interleaver/Deinterleaver Processor; Encoder/Decoder Processor)

| Unit                      | Hazard Rate                   |
|---------------------------|-------------------------------|
| A/D                       | $2.9 \times 10^{-7}$ per hour |
| Demodulator               | $4.9 \times 10^{-7}$ per hour |
| Interleaver/Deinterleaver | $4.9 \times 10^{-7}$ per hour |
| Encoder/Decoder           | $2.9 \times 10^{-7}$ per hour |
| Modulator                 | $4.9 \times 10^{-7}$ per hour |
The Bus Controller shown in Figure 2-12 provides the interface between the Main Memory and the processing elements of the system. The Bus Control chip contains the memory control, the bit ripple, the processing element and the interface to the Main Processor. The Controller Bus Interface Chips, in addition to providing interfaces to the buses, also provide a cross-bus capability which allows a bus to be bypassed in the data flow from A/D's to Modulators. The hazard rate for the bus controller is $4.3 \times 10^{-7}$ per hour.

#### 2.4.1 Externally Redundant Pipeline Processor Reliability

The summary of the reliability analysis for this system is shown in Table 2-9 for the four system loading examples. The number of active processors required within a processor configuration is listed in Table 2-8. There is no redundancy within the processor system or on its buses. All redundancy is provided external to the processor group.

The results show that from 2 to 17 spare processing configurations are required to achieve the reliability goals. This assumes no overhead in the Main Processor for interfacing with and controlling the spare processing systems. Clearly, this approach is much more expensive in terms of power, weight and volume as compared with the other alternatives.
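The effect of sparing whole processor groups can be sketched by treating one group as a series system and the spares as parallel copies. The sketch below uses the Example 1 unit counts from Table 2-8 and the unit hazard rates listed above; it assumes independent exponential failures, powered spares, perfect switchover and no bus hazard, none of which the report states explicitly, and its names are illustrative.

```python
import math

# Unit hazard rates per hour for the externally redundant pipeline system.
RATES = {"a_d": 2.9e-7, "demodulator": 4.9e-7, "interleaver": 4.9e-7,
         "encoder_decoder": 2.9e-7, "modulator": 4.9e-7, "bus_controller": 4.3e-7}

# Example 1 active unit counts summed from Table 2-8.
EXAMPLE_1 = {"a_d": 6, "demodulator": 2, "interleaver": 2,
             "encoder_decoder": 3, "modulator": 2, "bus_controller": 1}


def system_rate(counts):
    """Series system: any unit failure takes the whole processor group down."""
    return sum(RATES[unit] * n for unit, n in counts.items())


def mission_success(counts, spare_systems, years):
    """One active group plus spare groups; succeeds if any one group survives."""
    p = math.exp(-system_rate(counts) * years * 8760)
    return 1 - (1 - p) ** (spare_systems + 1)


print(round(mission_success(EXAMPLE_1, 2, 5), 4))
```

Because a single unit failure retires an entire group, each added spare brings a full copy of every unit, which is why the spare counts grow quickly with loading and mission length.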

#### 2.4.2 Bus Activity for the Externally Redundant Pipeline Processing System

The bus activity estimates (cf. Table 2-3) for the various processing functions show that the A/D to demodulator bus is the most congested for this system. The bus activity for this system is, therefore, as shown below:

| Example 1 | Example 2 | Example 3 | Example 4 |
|-----------|-----------|-----------|-----------|
| 7.4 µs    | 3.8 µs    | 0.94 µs   | 0.47 µs   |



Figure 2-12. Bus Controller for Externally Redundant System

Table 2-9. Externally Redundant Pipeline Processor Reliability

|                               | Active Processor Systems | Spare Processor Systems | P<sub>S</sub> |
|-------------------------------|--------------------------|-------------------------|---------------|
| Optimized for 5-year Mission  |                          |                         |               |
| Example 1                     | 1                        | 2                       | .9832         |
| Example 2                     | 1                        | 2                       | .9659         |
| Example 3                     | 1                        | 3                       | .9698         |
| Example 4                     | 1                        | 5                       | .9509         |
| Optimized for 10-year Mission |                          |                         |               |
| Example 1                     | 1                        | 3                       | .9601         |
| Example 2                     | 1                        | 4                       | .9526         |
| Example 3                     | 1                        | 7                       | .9639         |
| Example 4                     | 1                        | 17                      | .9526         |

These requirements are not as severe as for the previously discussed systems, as the maximum number of processors on this system's A/D to demodulator bus is only 22 as compared to 73 and 70 for the previously addressed systems.

#### 2.5 Internally Redundant Pipeline Processing System Description

This system (Figure 2-13) is the same as the Externally Redundant Pipeline system except that subelement redundancy is employed. The processors and buses internal to the system are provided individually with spares. This system is more efficient in total hardware required than the externally redundant system. At the same time, it keeps the number of processors on each bus to a reasonable number.

The active processors required for the internally redundant system are the same as for the externally redundant system (cf. Table 2-8).

#### 2.5.1 Processing Units for Internally Redundant Pipeline System

As with the externally redundant pipeline system, the processors are configured based on the three basic processing functions: demodulator, interleaver/deinterleaver and encoder/decoder. Figure 2-10 shows the differences among the three processors. The Encoder/Decoder is the basic 8-bit processor. The Demodulator has dual processing elements and the Interleaver/Deinterleaver has additional memory. The hazard rate for each processor is slightly higher for this system, as some provisions for fault tolerance must be made within the processors. The hazard rates are listed below:

| Unit                      | Hazard Rate                 |
|---------------------------|-----------------------------|
| A/D                       | $3 \times 10^{-7}$ per hour |
| Demodulator               | $5 \times 10^{-7}$ per hour |
| Interleaver/Deinterleaver | $5 \times 10^{-7}$ per hour |
| Encoder/Decoder           | $3 \times 10^{-7}$ per hour |
| Modulator                 | $5 \times 10^{-7}$ per hour |

The Bus Controller for this system is shown in Figure 2-14. The hazard rate for the Bus Controller is  $5 \times 10^{-7}$  per hour.



Figure 2-13. Internally Redundant Pipeline Processing System



Figure 2-14. Bus Controller for Internally Redundant Pipeline Processing System

#### 2.5.2 Internally Redundant Pipeline Processor Reliability

Table 2-10 is a summary of the reliability data for the internally redundant pipeline processing system. The spares required to achieve the reliability goals for the four loading examples are listed.

The reliability goals were achieved for all loading cases except example 4 for a 10-year mission. The probability of success for this case was .948. This can be increased to .95 by the addition of a second spare byte on the A/D to demodulator bus. A second technique of increasing the probability of mission success would be by multiplexing analog inputs to the A/D conversion units. This would reduce the total number of A/D units and would consequently reduce the number of units on the buses.

#### 2.6 Summary

The purpose of this study was to evaluate four representative processing architectures for a range of processing requirements. Each alternate architecture was sized to meet each of four processing requirements and two reliability requirements. The number of LSI devices required for each of the eight sets of requirements is listed in Table 2-11. The Internally Redundant Pipeline Processor architecture required significantly fewer LSI devices than the other three systems in all eight cases, while the Externally Redundant Pipeline system consistently required the greatest number. The advantage of the Internally Redundant system is greatest for Example 4, the loading example with the largest processing requirements, and is greater for 10-year missions than for 5-year missions.

Weight, power and volume are all proportional, to a large extent, to the number of LSI devices required. The Internally Redundant Pipeline System will, therefore, be the lightest, lowest-power and smallest-volume system of the four alternatives.

Figures 2-15 and 2-16 give a parametric representation of the estimated weight, volume and number of LSI chips (complexity) as a function of the information throughput.

Table 2-10. Internally Redundant Pipeline Processor Reliability

|                                          | Bus Controller Active | Bus Controller Spare | A/D's Active | A/D's Spare | Demodulators Active | Demodulators Spare | Interleaver/ Active | Interleaver/ Spare | Encoder/ Active | Encoder/ Spare | Modulator Active | Modulator Spare | Reliability |
|------------------------------------------|---------|----------------|--------|-------|--------------|-------|--------------|-------|----------|-------|-------------------------|------|-------|
| Optimized<br>for 5 years                 |         |                |        |       |              |       |              |       | 6        |       |                         |      |       |
| Example 1                                | 7       | 1              | 9      | 9     | 7            | -     | 74           | 1     | m        | 1     | ^                       | ,    | 3000  |
| Example 2                                | -       | 1              | n      | п     | 7            | 1     | m            |       | e        | 1.    | . ^                     | ٠,   | 1366  |
| Example 3                                | -       | 1              | 7      | 7     | S            | п     | 9            | -     | 4        | 1     | . 4                     |      | 679   |
| Example 4                                | 7       | 7              | 14     | 7     | 80           | ~     | 01           | -     | 9        | 1     | æ                       | · @  | 9595  |
|                                          |         |                |        |       |              |       |              |       |          |       |                         |      |       |
| Optimized<br>for 10 years                |         |                |        |       |              |       |              |       |          |       |                         |      |       |
| Example 1                                | 7       | 1              | 9      | 9     | 7            | 1     | 7            |       |          |       | ,                       |      |       |
| Example 2                                | 7       | 1              | п      | 11    | 7            |       | m            |       | , "      |       | ۷ ،                     | , ,  | 0696  |
| Example 3                                | 7       | 1              | 7      | 7     | s            | 7     | o            | . 7   | . 4      |       | ۸ 4                     | , 4  | 1/56. |
| Example 4                                | 7       | 1              | 14     | 14    | 80           | £     | 10           | е     | 9        | 2     | . 00                    | · œ  | 9480  |
| Example 4a (with 2 spare bytes on a/n co | r       |                | 77     | 77    | 80           | e .   | 10           | n     | 9        | ~     | <b>6</b> 0              | . ,  | .9534 |
| Bus Controller<br>bus)                   |         |                |        |       |              |       |              |       |          |       |                         | i.i. |       |
|                                          |         |                |        |       |              |       |              |       |          |       |                         | •    |       |


Table 2-11. LSI Requirements Summary

Number of LSI per system:

| Architecture | 5-year, Example 1 | 5-year, Example 2 | 5-year, Example 3 | 5-year, Example 4 | 10-year, Example 1 | 10-year, Example 2 | 10-year, Example 3 | 10-year, Example 4 |
|---|---|---|---|---|---|---|---|---|
| Distributed | 300 | 386 | 547 | 929 | 335 | 338 | 582 | 1032 |
|  | 324 | 366 | 596 | 904 | 356 | 408 | 628 | 1000 |
| Externally Redundant | 369 | 492 | 1064 | 2574 | 492 | 820 | 1883 | 7451 |
| Internally Redundant | 210 | 261 | 362 | 260 | 210 | 261 | 398 | 631 |



Figure 2-15. Comparison of Processor Configurations (Optimized for $P_s = 0.95$ for 10 year Mission)



Figure 2-16. Comparison of Processor Configurations (Optimized for  $P_s = 0.95$  for 5 year Mission)

The two graphs summarize the comparison of the four approaches to the particular digital communications processing requirements discussed in this Section. The discontinuities at the low end of the graphs reflect quantization effects: the addition of a processor to a system has a greater effect for the Example 1 and 2 configurations than for the Example 3 and 4 configurations.

#### SECTION 3

#### HARDWARE LIMITATIONS ON SATELLITE PROCESSOR APPLICATIONS

This section addresses the technological limitations imposed by the subsystems and devices which are the building blocks of the on-board signal processor. These limitations will constrain the spreading bandwidths, chip rates, processing gains and baseband rates feasible in analog subsystems, and the baseband processing rates and information throughput of real time digital subsystems. In the latter case, limitations will result from the architectural considerations discussed in Section 2 and from the intrinsic speed, gate and I/O density, memory size, reliability and hardening level achievable with the projected state-of-the-art of digital hardware in the time frame of interest.

#### 3.1 Analog Signal Processing Hardware Limitations

As previously discussed, the Front End Signal Processor interfaces with the satellite RF subsystems, performing analog signal processing functions such as AJ despreading and A/D conversion, and in some instances, demodulation, IF filtering and remodulation. The implementation of these processes will result in system limitations on the feasible AJ processing gain, chip rates and baseband rates for on board satellite processors.

#### 3.1.1 Processing of Frequency Hopped Signals

On board processing of frequency hopped signals imposes fewer hardware limitations than other spread spectrum modulation methods, as far as synchronization requirements and tolerance to fine grain imperfections over the transmission bandwidth of the communication channel are concerned. The latter include both the equipment transmission characteristics and the coherence of the transmission media.

It is therefore possible to have frequency hopped systems with very large spreading bandwidths, assuming of course that the frequency allocation is available. At millimeter wavelengths, spreading bandwidths of 2 GHz have been demonstrated, and there is no technical limitation on even larger bandwidths. The passband amplitude response of the equipment must be reasonably flat to avoid despreading losses for certain chip sequences; the phase characteristics, however, need to be controlled only over the chip emission bandwidth, which is a small fraction of the hopping bandwidth. These requirements are relatively easily met at the uplink Satellite Processor dehopper, although amplitude-frequency response is a consideration in the ground terminal transmitter design. On the other hand, modulation parameters such as processing gain and chip rate may be limited by the Processor dehopper, in particular by the frequency synthesizer design.

The main characteristics which define the performance of a frequency hopping synthesizer are:

- Resolution
- Settling time
- Spectral purity
- Hopping bandwidth
- Power consumption, volume and weight

The resolution of the synthesizer determines the minimum size of a frequency step, and therefore the number of available channels. It is limited by the long and short term stability of the frequency standard from which the output frequency is synthesized. The long term stability or drift of the satellite source may be compensated for at the ground terminal, or by some form of tracking filter at the on board demodulator (see ER78-4276 Final Report to Task No. 1, Section 3). However, short term instabilities that are too fast for the tracking loop to follow may degrade the achievable processing gain. This subject, which is related to the spectral characteristics of the synthesizer, is discussed below.

The settling time refers to the transient phenomena in a frequency hop and limits the usable chip rate: the synthesizer settling time should be a small fraction of the chip duration to minimize the processing degradation. The settling time depends on the method of synthesis and the device technology. Using mix and divide direct synthesis methods, current technology achieves settling times of 1.5 µsec with a resolution of the order of kHz. Using the same techniques and GHz logic, an improvement of more than one order of magnitude in settling time is estimated for the time frame considered in this Study, although at the expense of resolution, power consumption and spurious level. Then, for a reasonable processing loss (0.5 dB), maximum chip rates would be of the order of 1 to 5 Mchip/sec, unless radically new methods of frequency synthesis are invented in the future.
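As a rough check on the chip rate estimate above, the settling time budget can be worked through numerically. The sketch below assumes (this model is not taken from the report) that the signal energy received while the synthesizer is settling is simply lost, so that the processing loss is $-10 \log_{10}(1 - t_s / T_c)$ for settling time $t_s$ and chip duration $T_c$. The function names and the 150 nsec and 30 nsec settling times are illustrative, chosen as 10-fold and 50-fold improvements over the 1.5 µsec figure.

```python
import math


def settling_loss_db(settling_time_s: float, chip_rate_hz: float) -> float:
    """Processing loss when the settling interval of every chip is unusable.

    Assumption: lost energy is proportional to the lost fraction of the chip.
    """
    usable_fraction = 1.0 - settling_time_s * chip_rate_hz
    if usable_fraction <= 0.0:
        raise ValueError("settling time exceeds the chip duration")
    return -10.0 * math.log10(usable_fraction)


def max_chip_rate_hz(settling_time_s: float, allowed_loss_db: float) -> float:
    """Highest chip rate that keeps the loss within allowed_loss_db."""
    usable_fraction = 10.0 ** (-allowed_loss_db / 10.0)
    return (1.0 - usable_fraction) / settling_time_s


# 1.5 usec is the figure quoted in the text; the other two are hypothetical.
for t_s in (1.5e-6, 150e-9, 30e-9):
    rate = max_chip_rate_hz(t_s, 0.5)
    print(f"settling {t_s * 1e9:7.0f} ns -> max chip rate {rate / 1e6:6.3f} Mchip/s")
```

Under this assumption a 0.5 dB loss allows the settling time to occupy about 11 percent of the chip duration, so chip rates of 1 to 5 Mchip/sec correspond to settling times of roughly 110 nsec down to 20 nsec.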

The spectral purity of the synthesizer output signal limits the processing gain that may be achieved in a frequency hopped system. In-band spurs will translate interference lying outside the band of the chip emission into the post dehopping filter passband. Unfortunately, the direct synthesis methods required for fast hopping result in large spurious levels, typically of the order of 60 dB. The spur level is then a function of the synthesis scheme, the device technology and the actual hardware design.

Another limitation results from the close in spectral purity of the synthesizer output, equivalent to the short term stability in the time domain characterization, which may produce a degradation at low data rates. This degradation results from the broadening of the carrier spectrum, such that some of the received energy may fall outside the passband of the filter following the dehopper. The close in spectral purity, or short term stability, may be improved by cleaning phase lock loops, but ultimately cannot be fractionally better than that of the reference standard. High quality crystal oscillators are used as spaceborne frequency sources, although flight qualified atomic standards will probably be available in the time frame of interest. The close in spectral purity of the former is better (in fact, crystal oscillators are used in the cleaning loops of atomic standards), and no dramatic improvement of this technology is foreseen. The close in spectral broadening will be proportional to the square of the frequency of use and will impose lower bounds of the order of 50 chips or symbols per second at millimeter wavelengths.

The degradations resulting from spectral purity and frequency stability are illustrated by considering the case of an uplink signal using 8-ary FSK modulation, frequency hopped at 200 chips/sec with 4-fold chip diversity, and non-coherently detected with a 75 kHz equivalent noise bandwidth matched filter bank. Figure 3-1 shows the Es/No degradation for a given symbol error probability, where the short term instability is given by the rms frequency fluctuation averaged over a chip duration and the long term instability is the uncompensated frequency bias over many chips. As a practical case, consider the LES 8/9 satellite frequency source, which has a short term (over a chip duration) rms instability of 11.4 Hz when translated to Ka band (36-38 GHz) and a long term drift of 19 Hz/day. The resultant Es/No degradation due to the processor alone would range from negligible to 0.2 dB, depending on whether the long term drift is compensated or not.

It should also be pointed out that, for an actual system, degradations due to Doppler generated by uncompensated spacecraft motion and ground terminal platform motion, equipment frequency compensation errors, and complexity and cost factors may actually dictate the chip rate limits, rather than the Satellite Processor implementation.

The required hopping bandwidth also determines the method of frequency synthesis to be used. Very large hopping bandwidths may require changing several LO frequencies rather than only one, as illustrated by the dehopping synthesizer shown in Figure 3-2, which is capable of covering the 36-38 GHz band. Direct synthesis is used to achieve a settling time of less than 1.5 µsec, edge to edge of the 2 GHz bandwidth. A discrete number of switchable receiver LO frequencies translate the received signal into the same frequency band for fine frequency dehopping by a mix and divide type synthesizer.

The dehopper power consumption, volume and weight may be minimized by the choice of components and construction techniques. Thick film hybrid ceramic substrate technology with a modular approach offers the best possibilities for size reduction and reliability enhancement. Figure 3-3 shows the hybrid microelectronics version of the fine hopping synthesizer of the block diagram of Figure 3-2, as contrasted with the discrete version for a ground terminal application. The synthesizer is built from 6 modules each having two substrates which include a 4PDT switch, frequency selection circuitry, mixer, divider and amplifier, each having 250 mW dissipation and a volume of 1.75 cu. in. The modules may be built from radiation hardened devices and may be hermetically sealed.



Figure 3-1. Degradation Due to Frequency Offset and Instability ($P_e$/Symbol $\approx 4 \times 10^{-2}$, 8-ary MFSK)



Figure 3-2. Fine Hopping Synthesizer Block Diagram



Figure 3-3. Fine Hopping Synthesizer Ground and Space Versions

The above discussion has stressed the use of direct frequency synthesis to achieve high hopping rates. However, for moderately low rate requirements, indirect synthesizers offer cost and weight advantages. The settling time will be of the order of tens of microseconds, resulting in useful hopping rates of a few kilohops per second. Very high resolution may be obtained by the hybrid technique of mixing and dividing several indirect synthesizer loops, in this manner reducing the countdown delay. Using this approach and an architecture similar to the direct synthesis of Figure 3-2, the same number of frequency steps and hopping bandwidth may be obtained for lower hopping rates.

#### 3.1.2 Processing of Direct Sequence Signals

As mentioned above, Direct Sequence Spread Spectrum signals may suffer degradation due to imperfections in the fine grain characteristics of the transmission channel. These imperfections result from lack of coherence of the propagation media and from phase non-linearities and signal filtering effects in the ground terminal and processing satellite.

The lack of coherence of the propagation media sets an ultimate limit on the direct sequence spreading bandwidth and may result from differential amplitude, such as in the special case of operation at mmWave absorption bands, and from differential phase delays introduced by the passage through the ionosphere.

The group delay variation $\Delta \tau$ across the band $\Delta f$ due to the refractive index of the ionosphere is given by

$$\Delta \tau = \frac{\Delta f}{c f^3} \int f_p^2 \, ds \cong \frac{\Delta f}{c f^3} \langle f_p^2 \rangle R$$

where $c$ is the speed of light, $f$ is the carrier frequency and $\langle f_p^2 \rangle$ is the mean square plasma frequency of the free electrons in an ionosphere of slant range $R$. As might be expected, $\langle f_p^2 \rangle R$ is highly variable over diurnal, seasonal and sunspot cycles, and occasionally behaves abnormally. At 400 MHz, the product ($\Delta f \Delta \tau$) varies from 0.4 to 2 for a 10 MHz band. As discussed below, the degradation will be negligible for biphase spreading rates of the order of 7 Mchip/s or less at 400 MHz, 3 Mchip/s or less at 225 MHz, and for the full SHF and EHF satellite communications band allocations.
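The dispersion product quoted above can be reproduced approximately from the expression for $\Delta \tau$. The sketch below uses the standard relation $f_p^2 \approx 80.6\,N_e$ (Hz², with $N_e$ in electrons/m³), so that the integral of $f_p^2$ along the path equals 80.6 times the slant total electron content (TEC). The TEC values of $10^{18}$ and $5 \times 10^{18}$ electrons/m² are assumptions chosen for illustration, not values from the report, and the function names are hypothetical.

```python
C_M_PER_S = 299_792_458.0  # speed of light
K_PLASMA = 80.6            # f_p^2 [Hz^2] per electron/m^3 (standard approximation)


def delay_spread_s(bandwidth_hz: float, carrier_hz: float, tec_el_per_m2: float) -> float:
    """Group delay variation across the band: df / (c f^3) * integral(f_p^2 ds)."""
    plasma_integral = K_PLASMA * tec_el_per_m2  # integral of f_p^2 along the path
    return bandwidth_hz * plasma_integral / (C_M_PER_S * carrier_hz ** 3)


def dispersion_product(bandwidth_hz: float, carrier_hz: float, tec_el_per_m2: float) -> float:
    """Dimensionless product (delta f)(delta tau) used in the text."""
    return bandwidth_hz * delay_spread_s(bandwidth_hz, carrier_hz, tec_el_per_m2)


# Assumed slant TEC values (electrons/m^2); a 10 MHz band at 400 MHz as in the text.
for tec in (1e18, 5e18):
    product = dispersion_product(10e6, 400e6, tec)
    print(f"TEC {tec:.0e} el/m^2 -> (df)(dtau) = {product:.2f}")
```

With these assumed TEC values the product comes out near 0.4 and 2, which suggests the quoted range corresponds to slant electron contents of roughly this size. Because of the $f^{-3}$ dependence, the same band at SHF or EHF gives a product that is smaller by several orders of magnitude.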

Also, when phased array nulling antennas are used at UHF, the maximum chip rate may be restricted by the instantaneous bandwidth of the nulling array.

The degradation in correlation gain for a signal band-limited by a physically realizable network which is correlated with a local reference, may be obtained from the expression:

$$R(\alpha, \phi) = \mathrm{Re} \left\{ \frac{1}{2\pi} \int_{-\infty}^{\infty} |A(j\omega)|^2 \, Y(j\omega) \, e^{j(\omega\alpha+\phi)} \, d\omega \right\}$$

where

- $R(\alpha, \phi)$ is the correlation between the received signal and a local reference having time error $\alpha$ and phase difference $\phi$
- $A(j\omega)$ is the Fourier Transform of the PN modulation
- $Y(j\omega)$ is the filter transfer function referred to the center frequency of the PN signal

The correlation loss for biphase PN spreading due to parabolic phase distortion (linear component of group delay) of the filter network and due to the bandwidth restriction of the filter is given in Figure 3-4, as a function of the filter bandwidth and the parabolic phase distortion of the network at the first null of the PN signal spectrum. The degradation becomes significant for filter bandwidths narrower than the main spectral lobe and for parabolic distortion at the first spectral null exceeding one radian, or a linear group delay of more than 1 chip duration. For a channelized transponder, these filtering limitations mean that the chip rate must be kept below the transponder bandwidth to avoid degradation in the despreading process.
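The correlation integral above can be evaluated numerically for a simple case. The sketch below is an illustration under stated assumptions, not the computation behind Figure 3-4: the PN power spectrum is taken as $\mathrm{sinc}^2(fT_c)$ normalized to unit area, the filter is an ideal band-pass with a purely parabolic phase that reaches a given value at the first spectral null, the time error is zero, and the reference phase is chosen to maximize the correlation. The loss is expressed as $-20\log_{10}$ of the normalized peak, which ignores any accompanying change in noise power. All function names are hypothetical.

```python
import cmath
import math


def sinc(x: float) -> float:
    """Normalized sinc, sin(pi x) / (pi x)."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)


def correlation_peak(bandwidth_in_chip_rates: float, phase_at_null_rad: float,
                     steps: int = 4000) -> float:
    """Normalized correlation peak for an ideal band-pass filter.

    bandwidth_in_chip_rates: total filter bandwidth divided by the chip rate.
    phase_at_null_rad: parabolic phase at the first null of the PN spectrum.
    """
    half = bandwidth_in_chip_rates / 2.0
    dx = 2.0 * half / steps
    total = 0j
    for i in range(steps + 1):
        x = -half + i * dx                       # frequency in units of chip rate
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal rule end weights
        total += weight * sinc(x) ** 2 * cmath.exp(1j * phase_at_null_rad * x * x)
    return abs(total * dx)


def correlation_loss_db(bandwidth_in_chip_rates: float, phase_at_null_rad: float) -> float:
    return -20.0 * math.log10(correlation_peak(bandwidth_in_chip_rates, phase_at_null_rad))


for bw, phase in ((2.0, 0.0), (2.0, 1.0), (1.0, 0.0)):
    print(f"bandwidth {bw:.1f} x chip rate, phase {phase:.1f} rad -> "
          f"loss {correlation_loss_db(bw, phase):.2f} dB")
```

With no phase distortion and a filter just wide enough to pass the main lobe, the peak is about 0.90, reflecting the well known result that roughly 90 percent of the energy of a biphase PN spectrum lies in the main lobe.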

Another limitation on the direct sequence chip rate results from the ability to achieve synchronization. As the chip rate increases, the timing uncertainty set by the propagation delay spans more chips, and the acquisition time for a sliding correlator becomes excessively long. For instance, for a 100 Mchip/sec rate and a 100 km satellite range uncertainty, the acquisition time will be more than an hour.
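The size of the search implied by this example can be estimated with a few lines of arithmetic. The sketch below assumes a serial search in half-chip steps over a one-way range uncertainty, with a fixed dwell time per code phase. The report does not state the dwell time, so the 0.1 sec value is purely an assumption, and the function names are hypothetical.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light


def search_cells(range_uncertainty_m: float, chip_rate_hz: float,
                 step_chips: float = 0.5) -> int:
    """Number of code phases a sliding correlator must examine."""
    time_uncertainty_s = range_uncertainty_m / C_M_PER_S
    return math.ceil(time_uncertainty_s * chip_rate_hz / step_chips)


def acquisition_time_s(range_uncertainty_m: float, chip_rate_hz: float,
                       dwell_s: float, step_chips: float = 0.5) -> float:
    """Worst-case serial search time: one dwell per code phase."""
    return search_cells(range_uncertainty_m, chip_rate_hz, step_chips) * dwell_s


cells = search_cells(100e3, 100e6)
hours = acquisition_time_s(100e3, 100e6, dwell_s=0.1) / 3600.0  # assumed dwell
print(f"{cells} code phases, about {hours:.1f} hours at 0.1 s per phase")
```

Under these assumptions the search covers about 67,000 code phases, so the acquisition time exceeds one hour whenever the dwell per phase is longer than about 54 msec.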



Figure 3-4. Loss Due to Parabolic Phase Distortion and Finite Bandwidth

Shorter acquisition times may be achieved by measuring the time of arrival at the spacecraft, comparing it to satellite timing, and transmitting the differential via the downlink, or alternatively by adjusting the spacecraft PN generator timing accordingly. In any case, a programmable matched filter correlator will have to be implemented using SAW, CCD or digital technologies. Current state-of-the-art SAW convolvers would allow bandwidths of up to 50 to 100 MHz at a VHF center frequency, with an interaction time of the order of 20 to 100 µsec. Using CCD, the correlation length can be extended to the order of msec; however, bandwidth is limited to about 10 MHz. Digital programmable correlators also allow for very long correlation lengths, and the chip rate will be limited by the A/D converter capability and by the correlator itself, the latter being the dominant factor. The characteristics of a current state-of-the-art programmable digital correlator, a CMOS-on-sapphire LSI chip, are given in Table 3-1. By cascading several chips, almost unlimited sequence lengths may be practical.

Table 3-1. LSI Digital Correlator Chip Characteristics

• Programmable Length : 1 to 32 bits

• Package Configuration : 18-Pin DIP

• Correlation Output : Analog voltage output and binary coded digital output

• Chip Size : 180 mil x 180 mil

• Output Buffer (Two TTL Loads) : Logic 0, V<sub>max</sub> = 0.4 V, I = -3.2 mA; Logic 1, V<sub>min</sub> = 2.4 V, I = 80 µA

• Input Buffer (TTL) : Logic 0 Input Voltage, V<sub>min</sub> = 2.4 V

• Analog Correlation Rate : 20 MHz

• Digital Correlation Rate : 5 MHz

• Operating Temperature : -55°C to 125°C

• Power Dissipation : 40 mW
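As a rough illustration of what cascading implies, the sketch below estimates the chip count, total power, and ideal processing gain for a given correlation length using the Table 3-1 figures. It assumes that power simply adds per chip with no extra interconnect logic, and that processing gain follows the usual 10 log10(N) rule for coherent correlation over N chips; neither assumption is stated in the table, and the function name is hypothetical.

```python
import math

CHIP_LENGTH_BITS = 32   # maximum programmable length per chip (Table 3-1)
CHIP_POWER_MW = 40.0    # power dissipation per chip (Table 3-1)


def cascade_budget(sequence_length_chips):
    """Return (correlator chips needed, total power in mW, ideal gain in dB)."""
    n_chips = math.ceil(sequence_length_chips / CHIP_LENGTH_BITS)
    power_mw = n_chips * CHIP_POWER_MW                 # assumes power adds linearly
    gain_db = 10.0 * math.log10(sequence_length_chips)  # ideal coherent gain
    return n_chips, power_mw, gain_db


n_chips, power_mw, gain_db = cascade_budget(1024)
print(f"{n_chips} chips, {power_mw / 1000:.2f} W, {gain_db:.1f} dB")
```

For a 1024-chip sequence this gives 32 cascaded chips and about 1.3 W, which suggests that power, rather than sequence length, may become the practical limit on board.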

Of all these technologies, digital correlators may offer the most promising improvement in chip rate and processing gain. More than an order of magnitude improvement in LSI speed may be expected within the time frame of this study through the use of subnanosecond logic, resulting in chip processing speeds approaching 300 Mb/s with large correlation gains.

#### 3.1.3 IF and Baseband Processing

Analog IF and baseband processing functions that may be performed in a processing satellite include IF filtering, amplification and AGC, modulation and demodulation, and A/D conversion for subsequent digital signal processing.

#### 3.1.3.1 A/D Converters

IF and baseband digital signal processing require that the analog signal be converted into hard or soft quantized digital samples. The extent of real time digital signal processing possible is a function of the sampling rate and the number of quantizing levels in the A/D conversion of the analog signal.
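The dependence on sampling rate and quantizing levels can be made concrete with a small sketch. It uses the textbook result for an ideal uniform quantizer driven by a full-scale sine wave (SQNR ≈ 6.02 b + 1.76 dB), which is not taken from this report; the 100 Msample/s, 4-bit example is purely illustrative.

```python
def adc_summary(sample_rate_hz, bits):
    """Return (quantizing levels, output data rate in bit/s, ideal SQNR in dB)."""
    levels = 2 ** bits
    output_rate_bps = sample_rate_hz * bits   # raw data the processor must absorb
    sqnr_db = 6.02 * bits + 1.76              # ideal quantizer, full-scale sine input
    return levels, output_rate_bps, sqnr_db


levels, rate, sqnr = adc_summary(100e6, 4)
print(f"{levels} levels, {rate / 1e6:.0f} Mb/s, {sqnr:.1f} dB")
```

Each added bit of resolution raises the downstream processing load linearly while improving the ideal quantization noise floor by about 6 dB, which is the trade the text refers to.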

At present there is a definite need for higher-speed A/D converters for use with high-speed signal processors. The existing state of the art and a projection to 1980 are shown in Figure 3-5. Dielectrically isolated ECL technology is being pursued at a number of organizations to satisfy this need. For example, TRW is pursuing oxide-aligned-transistor (OAT) techniques and Lincoln Laboratory has developed a "poly-ox" isolation scheme which allows deep isolation and better packing density than oxide-isolation schemes.

The radiation hardness of A/D converters will be a special problem because of the sensitivity of the analog circuitry to ionizing dosage. However, ECL technology has inherent radiation hardness because of the high current levels at which it is operated. In the future, it is likely that GaAs FETs will be utilized in A/D converter technology. This material has inherent radiation hardness and will also allow an order of magnitude reduction in power consumption over ECL.

#### 3.1.3.2 Analog Modulation and Demodulation

Analog demodulation has been extensively implemented on board spacecraft systems and is an implementation alternative to the digital demodulation schemes discussed in Task 1 and in Section 2 of this report. The considerations in selecting one approach over the other include reliability, modulation rate, and whether further digital baseband processing is to be performed on the demodulated signal.

Figure 3-5. Monolithic A/D Converter Technology

For analog hardware, the desired reliability over the mission life has to be achieved through redundancy at the subsystem level rather than at the module level as in the architectures discussed in Section 2. For equal operational requirements, and for the time frame of interest, the digital implementation will result in power, size and weight savings; however, it will be limited in the maximum bit rate that can be demodulated. Analog DPSK modulation and demodulation at Gb/s rates has been demonstrated and seems feasible for on-board implementation.

## 3.2 Digital Signal Processing Hardware Limitations

#### 3.2.1 State-of-the-Art Device Technologies for 1985

#### 3.2.1.1 Introduction

Over the past several years, semiconductor technologies have made significant advances directed toward producing dense and extremely complex integrated circuits. This has been achieved through improved lithography, and by utilizing a number of new process technologies as well as circuit design concepts. The applications for these expanding semiconductor technologies have been segmented into two main areas: Logic and Memory.

In the logic area there are two major directions being pursued: medium speed (1-10 nanosecond) devices with very large scale integration (VLSI), and very high speed (<1 nanosecond) devices with medium-to-large scale integration (MSI to LSI) potential. The same division exists to some degree in the memory area, but the greatest emphasis there is on low power VLSI.

In the following sections the present and future potential of these areas will be explored in some detail in order to indicate what state-of-the-art technologies will be available by 1985. Table 3-2 indicates the various technologies being used or explored at present for memory and logic applications. Through 1985 it is expected that some new circuit technologies may emerge and be superior to those already listed. In fact, there appears to be an accelerated effort underway directed toward improving the performance of I.C. technologies in terms of density, power dissipation, and speed. The initial thrust is being directed toward the development of high density lithographic techniques (E-beam/X-ray). It is expected that improvements will be made, not only in density, but also in speed and power through a complete downward scaling of all device features.

Table 3-2. Technologies in Use or Under Exploration for Logic and Memory Applications

| LOGIC: BIPOLAR   | LOGIC: FET  | MEMORY: BIPOLAR  | MEMORY: FET |
|------------------|-------------|------------------|-------------|
| TTL              | NMOS        | TTL              | NMOS        |
| I<sup>2</sup>L   | PMOS        | I<sup>2</sup>L   | PMOS        |
| ECL              | CMOS        | ECL              | CMOS        |
| EFL              | CMOS/SOS    |                  | CMOS/SOS    |
| E<sup>2</sup>L   | DMOS        |                  | DMOS        |
| NTL              | VMOS        |                  | VMOS        |
|                  | Si MESFET   |                  | Si MESFET   |
|                  | GaAs MESFET |                  | CCD         |
|                  | CCD         |                  | FAMOS       |
|                  |             |                  | MNOS        |

Other technologies: Josephson junction

In order to indicate, in some detail, the benefits expected from the full exploitation of device scaling the next section will review its impact on existing MOS type device technologies.

#### 3.2.1.2 The Impact of Dimensional Scaling on VLSI

It has been predicted and partially demonstrated by a number of device technologists (References 1, 2 & 3) that tremendous gains in integrated circuit complexity can be achieved through the effective downward scaling of device element sizes and the appropriate adjustment of process parameters. Through this process of scaling, predictions are being made indicating the future evolution of Super VLSI devices; e.g., chips with hundreds of thousands of gates for random logic and/or millions of bits of memory. Along with these projections for very large device densities it is also predicted that these will be achieved with an increased speed of operation at little to no increase in total chip power dissipation. These projections are primarily based on silicon MOS type technologies, exclusive of developments in circuit design and developments using other materials technologies. Device scaling will have an impact on silicon bipolar technologies, although not as dramatic as that for silicon MOS structures.

As a guide to making density growth predictions for the various MOS type device technologies, charts similar to the one shown in Table 3-3 have been developed. This chart also indicates some of the performance limiting factors along with those areas that will place increasing demands on process technologies. Before discussing the impact of any such limitations it would be first very informative to explore the potential effect of dimensional scaling on one device technology for which scaling, rather than circuit innovation, will have the greatest effect on cell density. Such a technology is represented by charge coupled device digital memories.

Table 3-3. "MOS" Circuit Performance vs Device Scaling

|                  | DEVICE/CIRCUIT PARAMETER                          | SCALING FACTOR |
|------------------|---------------------------------------------------|----------------|
| DEVICE           | Device dimension t<sub>ox</sub>, L, W             | $1/\eta$       |
|                  | Doping concentration N                            | $\eta$         |
|                  | Voltage V                                         | $1/\eta$       |
|                  | Current I                                         | $1/\eta$       |
|                  | Capacitance εA/t                                  | $1/\eta$       |
|                  | Delay time/circuit, VC/I or 1/f<sub>c</sub>       | $1/\eta$       |
|                  | Power dissipation/circuit, VI or CV<sup>2</sup>f  | $1/\eta^2$     |
|                  | Power density VI/A                                | 1              |
| INTERCONNECTS    | Line resistance, $R_{L} = \rho L/Wt$              | $\eta$         |
|                  | Normalized voltage drop IR<sub>L</sub>/V          | $\eta$         |
|                  | Line response time R<sub>L</sub>C                 | 1              |
|                  | Line current density I/A                          | $\eta$         |
| RESULTS FOR VLSI | Integration level                                 | $\eta^2$       |
|                  | Power/chip                                        | 1              |
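The relationships in Table 3-3 can be cross-checked with a minimal sketch that starts from three primary scalings (dimensions, voltage, and current, each multiplied by 1/η) and derives the remaining entries from the formulas given in the parameter column. Reading the table as constant-field scaling is an assumption made here, and the function and key names are illustrative only.

```python
def constant_field_scaling(eta):
    """Factor by which each quantity changes when all linear dimensions,
    the voltage, and the current are multiplied by 1/eta (eta > 1 is a shrink)."""
    dim = 1.0 / eta                              # t_ox, L, W
    voltage = 1.0 / eta
    current = 1.0 / eta
    area = dim * dim
    capacitance = area / dim                     # eps * A / t
    delay = voltage * capacitance / current      # V * C / I
    power = voltage * current                    # V * I per circuit
    line_resistance = dim / (dim * dim)          # rho * L / (W * t)
    integration = 1.0 / area                     # circuits per unit chip area
    return {
        "doping": eta,
        "capacitance": capacitance,
        "delay": delay,
        "power_per_circuit": power,
        "power_density": power / area,
        "line_resistance": line_resistance,
        "normalized_voltage_drop": current * line_resistance / voltage,
        "line_response_time": line_resistance * capacitance,
        "line_current_density": current / area,
        "integration_level": integration,
        "power_per_chip": power * integration,
    }


for name, factor in constant_field_scaling(2.0).items():
    print(f"{name:26s} x{factor:g}")
```

Halving every dimension under these rules halves the delay and quarters the power per circuit, while the line response time and the power per chip stay unchanged, which is the behavior the text goes on to discuss.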

A basic CCD digital memory cell consists of four elemental sections as shown on the following page.

This basic cell represents the minimum complexity for any clocking arrangement, whether it be for 1, 2, or 4 phase operation. Digital CCD memories are relatively efficient, in that the active cells will occupy approximately 50 to 60 percent of the available chip surface area.



From the basic cell layout the cell area is given by,

$$C.A. = 8L^2$$

The total number of memory bits on a chip can be represented by the following equation:

$$N = \frac{\text{Chip Area}}{C.A.} \times \text{Utilization}$$

It is obvious from this simple equation that any of the three variables can impact the number of memory bits on a chip. However, if a direct effort is made to reduce the cell area considerably, then it would be impractical to simultaneously increase the chip area significantly, if at all. If chip area does not change, then it can be argued that the % utilization factor should not change appreciably for a given memory architecture. Of course, a change in architecture could affect the % utilization. For the present it would be appropriate to consider a chip whose dimensions are approximately 5 mm by 5 mm (chip area = 25 mm<sup>2</sup>) with a memory cell utilization factor of 50%.

As a starting point it would be appropriate to use a mask linewidth of 4 µm as representative of current state-of-the-art lithography. This does not imply that VLSI chip designs do not exist with some small percentage of narrower linewidths (1 to 2 µm), but that 4 µm is the maximum dimension (linewidth or spacing) for 80 to 90% of the chip features (exclusive of bonding pads). From these considerations the projections shown in the following table can be postulated.

| Linewidth | Cell Area           | Maximum Number\* of Memory Cells |
|-----------|---------------------|----------------------------------|
| 4 µm      | 128 µm<sup>2</sup>  | 99 K                             |
| 2 µm      | 32 µm<sup>2</sup>   | 392 K                            |
| 1 µm      | 8 µm<sup>2</sup>    | ~1.5 Meg.                        |
| 0.5 µm    | 2 µm<sup>2</sup>    | ~6 Meg.                          |

\* Chip Area = 25 mm<sup>2</sup>; % Utilization = 50
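The projection can be reproduced approximately from the two equations given earlier (cell area of 8L² and N = chip area / cell area × utilization). The sketch below is a minimal check under the stated 25 mm² chip and 50% utilization; the function name is hypothetical, and the computed values come out slightly below the rounded 99 K and 392 K entries.

```python
def ccd_memory_bits(linewidth_um, chip_area_mm2=25.0, utilization=0.5):
    """Return (cell area in um^2, number of memory bits) for a CCD memory chip."""
    cell_area_um2 = 8.0 * linewidth_um ** 2      # C.A. = 8 L^2
    chip_area_um2 = chip_area_mm2 * 1e6          # 1 mm^2 = 1e6 um^2
    bits = chip_area_um2 / cell_area_um2 * utilization
    return cell_area_um2, bits


for linewidth in (4.0, 2.0, 1.0, 0.5):
    cell, bits = ccd_memory_bits(linewidth)
    print(f"{linewidth:>4} um  cell {cell:>5.0f} um^2  about {bits / 1e3:,.0f} K bits")
```

Because the cell area depends on the square of the linewidth, each halving of the linewidth quadruples the bit count for a fixed chip area and utilization.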

This table indicates that if scaling can be thoroughly applied to all the device and circuit parameters, tremendous gains in memory density can be achieved. It is important to note that the column for linewidths represents maximums.

In order to compare some elements of this table to a real baseline it would be instructive to examine some of the most recently available CCD memories. One 64K-bit CCD memory chip that has been developed using the equivalent of 6 µm geometry resulted in a cell size of 256 µm<sup>2</sup>. However, the resulting chip area was 33 mm<sup>2</sup>. In fact, if the equation for memory density presented earlier is applied here, the result is

$$N = \frac{33 \times 10^6 \ \mu\text{m}^2}{256 \ \mu\text{m}^2} \times 0.5 \approx 64\text{K bits}$$

This, at least, provides for some credibility to the projection methodology.

Another chip design has achieved a density of 131K bits of memory using design rules similar to those for the 64K-bit design. However, the factor of two density increase was achieved by allowing each memory cell to carry the equivalent of two digital bits of information. This capitalizes on the analog character of CCD storage wells. In so doing a penalty is paid with respect to latency time. This can affect either maximum register length or minimum frequency of operation, within a given set of process/design parameters.

For both designs the thinnest gate oxide regions are typically  $800\text{\AA}$ . If these designs were scaled to 4  $\mu\text{m}$ , then the gate oxide thickness would have to be reduced to approximately  $600\text{\AA}$ . Such a thickness is achievable using current process technology for VLSI chip designs.

If we consider that 600Å would be an appropriate gate oxide thickness for 4 µm geometry, then as scaling rules are applied, the oxide thickness must be reduced accordingly. The results of scaling are shown in Table 3-4.

Table 3-4. Effects of Scaling on Circuit/Device Parameters

| Linewidth                 | 4 µm   | 2 µm     | 1 µm     | 0.5 µm   |
|---------------------------|--------|----------|----------|----------|
| Gate Oxide                | 600 Å  | 300 Å    | 150 Å    | 75 Å     |
| Maximum Voltage           | 8-10 V | 4-5 V    | 2 V      | 1 V      |
| Delay Time/Stage          | 0.5 ns | 0.24 ns  | 0.1 ns   | 0.05 ns  |
| Power Dissipation/Circuit | 0.1 mw | 0.025 mw | 0.006 mw | 0.002 mw |
| Line Response Time        | 0.1 ns | 0.1 ns   | 0.1 ns   | 0.1 ns   |
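The entries in Table 3-4 follow closely from applying ideal scaling to the 4 µm baseline column. The sketch below is one way to regenerate them, assuming oxide thickness and delay scale as 1/η, power per circuit as 1/η², and line response time not at all; the function name is illustrative, and the table's published figures are rounded versions of these results.

```python
def scale_from_4um(linewidth_um):
    """Ideal-scaling estimate of Table 3-4 entries from the 4 um baseline."""
    eta = 4.0 / linewidth_um                 # shrink factor relative to 4 um
    return {
        "gate_oxide_angstrom": 600.0 / eta,  # scales as 1/eta
        "delay_ns": 0.5 / eta,               # scales as 1/eta
        "power_mw": 0.1 / eta ** 2,          # scales as 1/eta^2
        "line_response_ns": 0.1,             # does not scale
    }


for linewidth in (4.0, 2.0, 1.0, 0.5):
    row = scale_from_4um(linewidth)
    print(linewidth, {k: round(v, 4) for k, v in row.items()})
```

At 0.5 µm the ideal values are 75 Å, about 0.06 ns, and about 0.0016 mw, against the tabulated 75 Å, 0.05 ns, and 0.002 mw.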

In this table selected critical parameters have been chosen for illustration. Note that for gate oxides the appropriate thickness for a 0.5 µm linewidth is 75 Å. This represents a very ambitious goal if one considers the fact that over the last 5 to 6 years gate oxides have only been reduced by at most a factor of 2 as linewidths have been reduced by a factor of 4 to 5. Again, this does not imply that very thin oxides have not been grown or deposited in the fabrication of novel thin device structures, but that achieving this for super VLSI devices may not be easy. It is anticipated that 0.5 µm linewidth resolution will be achieved, reproducibly, before a thin oxide process has matured sufficiently for inclusion in VLSI designs. In fact, MESFET gate structures will probably benefit the most from reduced linewidths. This has already been demonstrated.

The second item in Table 3-4 that has been scaled is the maximum gate, or supply, voltage. As can be seen, this must be reduced to approximately one volt in order to take full advantage of increased performance while maintaining internal fields and space charge layers in balance, as well as contribute to a reduction in the power dissipation per cell structure. A reduction in power dissipation per cell is necessary in order to increase the functional density on chip. This may pose some problems relative to noise immunity (as does a shrinking cell size).

The next item in Table 3-4, the delay time per stage, can be interpreted in a number of ways. It could represent the required transfer time in a CCD cell to guarantee a 0.999 element transfer efficiency so that a 256 bit linear array can be effectively operated at 10 MHz, or the internal delay of a CMOS/SOS or NMOS logic gate. The value of 0.5 ns for this parameter has been demonstrated for structures fabricated using 4 µm linewidths. It is interesting to note that the delay per stage will be reduced to only 50 ps if full scaling can be applied with 0.5 µm linewidths. If gate oxides cannot be scaled to 75 Å for this linewidth, but are limited to 150 Å, then other modifications would have to be included. For example, the voltage scaling would stop at the level appropriate for a 150 Å gate oxide. With the higher supply voltage the current would increase (it would not be scaled to the level required at 0.5 µm) and may then be limited by carrier velocity. Attention will then have to be given to the channel doping level, which would normally have been scaled linearly so that punch through (or short channel) effects would not be experienced. If the voltage, channel width, and doping level are not scaled together, then for a reduced channel length with a non-reduced supply voltage the doping level in the channel will have to be increased by an amount greater than the scaling factor applied to its length. This is required to avoid punch through or surface barrier lowering effects, but it may reduce the junction breakdown voltage and/or increase the junction parasitic capacitance. Two of the references cited earlier (1, 2) considered this case and have proposed two-level channel doping profiles.

From an examination of the scaling rules, excluding doping levels, the net effect of not scaling the oxide thickness when scaling the linear dimensions from 1 µm to 0.5 µm is that the time delay per circuit would scale by almost $1/\eta^2$, but the power dissipation per circuit would increase slightly. Therefore, although the power-delay product scales to $1/\eta^2$, the total chip power dissipation would increase as an attempt were made to increase the chip circuit density. A significant increase in density could not be achieved.

Another option would have consisted of scaling the voltage along with the lateral dimensions while maintaining the oxide thickness constant. This would result in a scaling of the power dissipation per circuit approaching  $1/\eta^3$  and a time delay scaled to  $1/\eta$ . However, this would not be very practical since the threshold voltages would not be in balance. Modifications to the channel doping profile would have to be included so that the threshold voltages could be scaled with the supply voltages. As a result of this, it is anticipated that complete supply voltage scaling would not be achieved. Thus, the power dissipation per circuit would be no less than that corresponding to complete scaling. The obvious advantage to this approach is that it is possible to configure a scaled structure in which the oxide vertical dimensions do not have to be extremely thin in order to achieve a low power dissipation.
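The partial-scaling cases discussed above can be compared with a simple first-order model. The sketch below assumes gate capacitance proportional to lateral area divided by oxide thickness, and a square-law drain current proportional to V²/t<sub>ox</sub> at fixed W/L with threshold voltage neglected. These device equations are assumptions introduced here for illustration and are not given in the report, so the results are only indicative of the trends quoted.

```python
def mos_scaling(lateral, oxide, voltage):
    """Relative delay, power, and power-delay product.

    Each argument is the multiplier applied to that quantity
    (for example 0.5 means the quantity is halved).
    """
    capacitance = lateral ** 2 / oxide       # eps * (W * L) / t_ox
    current = voltage ** 2 / oxide           # square law, W/L unchanged
    delay = voltage * capacitance / current  # V * C / I
    power = voltage * current                # V * I per circuit
    return {"delay": delay, "power": power, "power_delay": delay * power}


s = 0.5  # corresponds to eta = 2
print("full scaling          ", mos_scaling(s, s, s))
print("lateral only          ", mos_scaling(s, 1.0, 1.0))
print("lateral + voltage only", mos_scaling(s, 1.0, s))
```

With η = 2, scaling only the lateral dimensions gives a delay factor of 1/η² with unchanged power per circuit, and scaling lateral dimensions plus voltage at constant oxide thickness gives a delay factor of 1/η with power per circuit of 1/η³. The model does not capture the slight power increase, the threshold-voltage imbalance, or the velocity limits mentioned in the text.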

The next item highlighted in Table 3-4 considers the effect on power dissipation per circuit as a function of scaling. As a base it can be assumed that for low power technologies, such as CMOS/SOS, the power dissipation per logic function using 4 µm linewidth lithography is approximately 0.1 mw. If scaling is completely effective, then this will approach 2 µw per logic function at the 0.5 µm linewidth level. Parasitic power contributions may limit this ultimately.

The last item specifically addressed is that of line delay. From the scaling rules shown in Table 3-3, it can be seen that this circuit parameter does not scale. It can be reasonably estimated that for a 4 µm linewidth circuit the line delay is approximately 20% of the circuit delay (or 0.1 ns). Thus, as smaller linewidths are achieved the line delay can exceed the circuit delay. However, if the conductivity of the interconnects is increased through the use of different materials and processes, this element may be reduced.
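A brief sketch shows where the line delay overtakes the circuit delay if the circuit delay scales ideally as 1/η from the 0.5 ns value at 4 µm while the line delay stays at 0.1 ns. The ideal 1/η behavior and the helper names are assumptions for illustration; Table 3-4 uses rounded delays, which place the crossover near 1 µm.

```python
def circuit_and_line_delay_ns(linewidth_um, circuit_delay_4um_ns=0.5, line_delay_ns=0.1):
    """Return (circuit delay, line delay) in ns at the given linewidth."""
    eta = 4.0 / linewidth_um
    return circuit_delay_4um_ns / eta, line_delay_ns


def crossover_linewidth_um(circuit_delay_4um_ns=0.5, line_delay_ns=0.1):
    """Linewidth at which the unscaled line delay equals the scaled circuit delay."""
    return 4.0 * line_delay_ns / circuit_delay_4um_ns


print(circuit_and_line_delay_ns(0.5))
print(crossover_linewidth_um())
```

Below the crossover linewidth the interconnect, not the gate, sets the stage delay unless the line resistance is reduced by other means.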

One factor, not specifically highlighted, but of concern, is the increase in line current density that will result. This will affect those technologies that already have high densities, such as most of the bipolar structures. For many silicon based MOS technologies the increase in current density will most likely be tolerable.

From the discussions above, it can be seen that dimensional scaling will have a considerable impact on the growth of integrated circuit densities in the near future. This will be achieved only if the resources are committed to the development of high density lithography techniques, such as E-beam and X-ray, coupled with advances in materials and processing. In addition to the fabrication technology development a similar emphasis must be given to the design, test, and packaging of integrated systems that will take full advantage of these highly dense complex devices. In the next section we will discuss some of the advances that must be made in process technology in order to support very small device structures.

#### 3.2.1.3 Semiconductor Process Technology

As the density and complexity of silicon integrated circuit chips continue to expand, the requirement for very large wafers of a perfection unknown ten or even five years ago becomes a major factor. This, together with advances in process technology, is what is permitting the growth that is being realized. LSI manufacturers are continually revising their silicon wafer requirements as chip size and complexity increase. An example of the evolution of some of the basic crystal and wafer parameters is shown in Table 3-5.

Table 3-5. Silicon Specifications

|                                         | 1960    | 1965   | 1970  | 1975    | 1985 (est) |
|-----------------------------------------|---------|--------|-------|---------|------------|
| Wafer Diameter (mm)                     | 12-25   | 25-50  | 50-75 | 75-100  | 100-150    |
| Thickness (mm)                          | 0.1-0.2 | 0.25   | 0.4   | 0.4-0.6 | 0.6-0.8    |
| Resistivity Radial Gradient             | 30%     | 20%    | 15%   | 12-15%  | 5-12%      |
| Dislocation Density (cm <sup>-2</sup> ) | 50,000  | 10,000 | 1,000 | 100-500 | 10-50      |
| Surface Finish                          | 1/2μ    | 1/4μ   | 0     | 0       | 0          |
| Crystal Weight                          | 200 g.  | 1 kg.  | 4 kg. | 12 kg.  | 20 kg.     |

It is obvious that as the chip sizes and complexity of circuits increase, any remaining surface imperfections on wafers have an increasingly severe impact on yield. This places an increased demand on improved techniques for wafer slicing and subsequent lapping and polishing. The use of non-contact printing techniques, which has permitted production line resolutions of 2 µm, requires wafer flatness and thickness uniformity much better than that in conventional contact printing. As line resolution moves to 0.5 µm, material quality requirements become even more stringent.

During processing, stacking faults induced during the oxidation cycles are known to contribute to reduced yield. This has prompted a great deal of work on effective processing procedures, such as the use of dry O<sub>2</sub> prior to a steam oxidation cycle and the use of HCl for in-situ cleaning of oxidation tubes. New oxidation techniques using high-pressure, low-temperature environments have been explored and have resulted in very low surface state densities (<10<sup>10</sup> cm<sup>-2</sup>).

An example of the growth in chip sizes over the past few years, with an extrapolation to the future, is shown in Figure 3-6. In this figure a "band" of chip areas is shown. The direction taken by most semiconductor manufacturers has been to introduce devices with increased complexity using process technologies that are currently available. This usually results in a chip area of considerable size. As process technologies improve in a direction to reduce device geometries, the corresponding chip areas are reduced. This has the very positive effect of increasing device yield as well as providing the impetus to further increase device complexity. In general, most manufacturers are restricting production chip sizes to below 40,000 mil<sup>2</sup>. With the most recent initiative, it is expected that chip sizes will grow at an expanded rate. Chips on the order of 160,000 mil<sup>2</sup> are expected by 1985. At this time 80,000 mil<sup>2</sup> CCD memory chips are being fabricated on a prototype basis. Reprogrammable read-only memories of the MNOS variety (a very complex process) are being fabricated on chips with areas in excess of 60,000 mil<sup>2</sup>.
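Since chip areas are quoted here in square mils and elsewhere in this section in square millimeters, a small conversion sketch may help relate the two. It relies only on the definition 1 mil = 0.0254 mm; the function name is illustrative.

```python
MM_PER_MIL = 0.0254  # 1 mil = 0.001 inch = 0.0254 mm


def mil2_to_mm2(area_mil2):
    """Convert an area in square mils to square millimeters."""
    return area_mil2 * MM_PER_MIL ** 2


for area in (40_000, 60_000, 80_000, 160_000):
    print(f"{area:>7,} mil^2 = {mil2_to_mm2(area):6.1f} mm^2")
```

On this basis the 40,000 mil<sup>2</sup> production ceiling is close to the 25 mm<sup>2</sup> chip assumed in the CCD memory projection, and the 160,000 mil<sup>2</sup> chips expected by 1985 would be roughly four times that area.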

The process technologies that have had a significant impact on the high level of achievable complexities in integrated circuits consist of polysilicon for multilevel, self-aligned gate structures in MOS devices, local oxidation techniques for field isolation, and ion implantation for both MOS and bipolar impurity doping. These innovations, together with improved photolithography, have been responsible for the development of 64K bit MOS memory chips and 16K bit I<sup>2</sup>L bipolar memory chips.



\* AREA REDUCTION DUE TO PROCESS INNOVATION

Figure 3-6. Growth in Chip Area

In the area of photolithography, the progression from contact to non-contact projection techniques has brought significant improvements in both linewidth resolution and yield. With projection printing using electron-beam generated masks, feature sizes of one to two µm will become possible on a production basis in the early 1980s. Currently, fine line projection printing is accomplished using very controlled production techniques. The typical limit today is in the range of 2 to 4 µm. An optical resolution limit is predicted close to 1 µm. In the future, E-beam or X-ray printing techniques show promise of being used for 0.5 µm feature sizes. E-beam lithography presently requires a capital investment of greater than \$1M and is slow. X-ray lithography is at present an exploratory, prohibitively expensive, and very slow approach, but it is not susceptible to dust and other contamination-dependent defects.

With this as a background let us now discuss the potential for semiconductor chip technologies in the areas of logic and memory.

#### 3.2.1.4 Semiconductor Logic Technologies

As mentioned above, semiconductor logic technologies generally fall into two classes: medium speed VLSI and high speed MSI to LSI. The VLSI, medium speed logic class is characterized by low power dissipation per gate (0.05 mw to 1 mw) and propagation delay times in the range of 1 to 10 nanoseconds per gate. This logic class is also characterized by both high input and output impedance levels. Because of the low power dissipation and the low to medium element count per gate, the integration level today generally ranges from 500 to 5,000 gates per chip.

The very high speed, MSI/LSI logic class is characterized by a relatively high power dissipation per gate (5 mw to 100 mw) and propagation delay times less than one nanosecond. In most cases this logic class must operate in a low impedance transmission line environment and, therefore, must possess a low output impedance (5 to 20 ohms). With this output drive condition the only way to minimize power dissipation is to restrict the logic level excursions. In addition to the high power dissipation these logic gates are, in general, relatively complex with element counts ranging as high as 10 per gate. These factors combine to restrict present integration levels to below 1000 gates per chip.
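The link between power per gate and achievable integration level can be illustrated by multiplying the two. The sketch below pairs the quoted gate counts with the quoted power-per-gate ranges; that pairing, and the assumption that all gates dissipate simultaneously, are illustrative choices rather than figures from the report.

```python
def chip_power_w(gates, power_per_gate_mw):
    """Total chip dissipation in watts if every gate dissipates the given power."""
    return gates * power_per_gate_mw / 1000.0


# Medium speed VLSI class: 500 to 5,000 gates at 0.05 to 1 mw per gate
print("VLSI      :", chip_power_w(500, 0.05), "to", chip_power_w(5000, 1.0), "W")

# Very high speed MSI/LSI class: up to 1,000 gates at 5 to 100 mw per gate
print("high speed:", chip_power_w(1000, 5.0), "to", chip_power_w(1000, 100.0), "W")
```

Even at one fifth of the gate count, the high speed class starts where the medium speed class tops out in total dissipation, which is consistent with the restriction to fewer than 1000 gates per chip noted above.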

Historically, commercial computers in the high-performance range always utilized the circuit family with the highest available switching speed. In so doing, heavy penalties were incurred relative to power dissipation, packaging complexity and parts cost. The reason for this choice is the overall simplification of software architecture achieved by maximizing serial processing.

The goals for most military applications of semiconductor technology generally consist of achieving a balance between high performance, high reliability and moderate cost. The overall advances in semiconductor logic technology within the past few years are thus permitting the military user to more effectively combine the myriad of available devices in achieving these goals. The requirements for reliability and graceful degradation in military systems favor multiprocessor systems of various architectural configurations. This forces a certain level of parallel processing and reduces the required performance level of individual arithmetic units. In many areas VLSI, with speeds in the range of 1 to 10 nanoseconds, can well meet the demands for a significant amount of the signal processing requirements. Only in the most demanding signal processing applications where ultra-high speed arithmetic units are required will gigahertz logic be employed.

These considerations do not imply that very fast logic of relatively high power dissipation cannot provide the optimum solution to many applications, but it means that LSI and VLSI technologies of lower speed and much lower power will be competitive in systems with high throughput rates. To analyze the optimum trade-off between these two groups of circuit families, their characteristics will now be discussed in some detail.

#### 3.2.1.4.1 Characteristics of LSI/VLSI Circuit Families

The two technologies most widely used at present in military systems are Schottky TTL and bulk CMOS. Schottky TTL offers the advantages of relatively high speed, good noise immunity, and excellent current sinking capability which facilitates the use of integrated drivers. Unfortunately, it has a relatively high power dissipation and a large component count per gate (≈10) which limits the economic integration level of STTL to about 400 gates per chip. Bulk CMOS dissipates less power, but is slower, has limited drive capability, and is not particularly dense either. It does, however, have a better integration potential.

Since the greatest cost reduction and performance increase of digital systems can be achieved by going to the highest technologically feasible monolithic integration level, both technologies will be slowly supplanted by novel circuit families which can be better integrated. This, in turn, means smaller signal swings, fewer components per gate and reduced parasitic capacitance. Figure 3-7 shows the diagrams of some contending circuit families, and Table 3-6 lists their main characteristics. Naturally, the numbers shown are typical only, and can vary considerably, depending on processing details. The delays include on-chip line delays.

As can be seen, I<sup>2</sup>L and SOS are especially attractive circuits. I<sup>2</sup>L can easily be combined on-chip with powerful bipolar drive circuits. SOS, on the other hand, combines low power dissipation with switching speeds of considerable magnitude, exceeding the one nanosecond barrier. Short channel bulk NMOS and NMOS on SOS would be close contenders, except for the fact that their higher power dissipation limits the achievable integration level despite the good layout density of these circuits.

For most bipolar manufacturers, I<sup>2</sup>L represents a most promising technology for many VLSI circuits. Logic circuits employing I<sup>2</sup>L can be readily fabricated to operate over the full military temperature range (-55 to +125°C). An I<sup>2</sup>L microcomputer chip set is being offered by one manufacturer that includes a 16-bit single chip microprocessor. The gate speeds of I<sup>2</sup>L circuits are expected to approach those of conventional TTL with MSI complexity.

One problem encountered in going to higher integration levels is the accommodation of an increasing number of chip-to-carrier interconnections. Chips in the future will have as many as 100 to 200 I/O pads. This will place additional demands on test equipment which, at present, does not exist.

#### 3.2.1.4.2 Characteristics of Gigahertz Logic Families

Compared to lower speed technologies, fewer high speed circuit families are evolving as serious contenders, and except for ECL type circuits, most are still in the exploratory or advanced development stage. The important high speed circuit families are shown in Figure 3-8, and Table 3-7 provides a summary of their characteristics. Actually, some of the VLSI circuit families discussed in the last section are candidates in this category.









Figure 3-7. Circuit Families Suitable for LSI/VLSI Implementation

(Panel label: DMOS logic gate with depletion load, or Si MESFET in enhancement mode. Note: a dashed line means the node can be tied to other like nodes to perform logic.)

Table 3-6. Characteristics of VLSI Circuit Families

| Circuit Family   | Year | Line Resolution (µm) | Integration Level<br>(Gates/Chip) | Gate Delay<br>(ns) | Power/Gate (mW) |
|------------------|------|----------------------|-----------------------------------|--------------------|-----------------|
| CMOS/SOS         | 1979 | 2-4                  | 5,000                             | 1                  | 0.02            |
|                  | 1982 | 1-2                  | 20,000                            | 0.5                | 0.005           |
|                  | 1985 | 0.5-1                | 100,000                           | 0.25               | 0.002           |
| NMOS             | 1979 | 2-4                  | 5,000                             | 1                  | 0.3             |
|                  | 1982 | 1-2                  | 20,000                            | 0.5                |                 |
|                  | 1985 | 0.5-1                | 20,000                            | 0.2                | 0.02            |
| I<sup>2</sup>L   | 1979 | 2-4                  | 8,000                             | 10                 | 0.05            |
|                  | 1982 | 1-2                  | 30,000                            | 7                  | 0.03            |
|                  | 1985 | 0.5-1                | 100,000                           | 5                  | 0.01            |
| Si MESFET        | 1979 | 2-4                  | 10,000                            | 0.8                | 0.1             |
|                  | 1982 | 1-2                  | 20,000                            | 0.4                | 0.05            |
|                  | 1985 | 0.5-1                | 20,000                            | 0.2                | 0.02            |




Figure 3-8. Circuit Families for Gigahertz Operation

Table 3-7. Characteristics of High Speed Circuit Families

| Circuit Family | Year | Line Resolution (µm) | Integration Level<br>(Gates/Chip) | Gate Delay<br>(ns) | Power/Gate (mW) |
|----------------|------|----------------------|-----------------------------------|--------------------|-----------------|
| GaAs MESFET    | 1979 | 2-4                  | 1,000                             | 0.2                | 0.25            |
|                | 1982 | 1-2                  | 5,000                             | 0.1                | 0.1             |
|                | 1985 | 0.5-1                | 10,000                            | 0.05               | 0.05            |
| ECL/EFL        | 1979 | 2-4                  | 1,000                             | 0.5                | 2               |
|                | 1982 | 1-2                  | 2,000                             | 0.3                | 0.5             |
|                | 1985 | 0.5-1                | 20,000                            | 0.1                | 0.2             |

The table again provides typical values of the various parameters. The gate delays shown, in particular, represent on-chip values, including capacitive and travel delays as encountered in devices of significant integration level. Off-chip drive power will be significantly higher. Bare gates with a minimum of parasitic loading have been reported in the literature to exhibit considerably shorter switching delays, on the order of 100 picoseconds for ECL and EFL, and between 20 and 40 picoseconds for GaAs MESFETs. Electron beam-masked MOS circuits have also demonstrated delays close to 100 picoseconds, but this masking technology does not yet represent a high-yield production method for integrated circuits because of the submicron linewidths and registration tolerances used. For the near term, minimum gate delays will realistically approach 100 to 300 picoseconds. Such a delay permits 10 to 20 gate delays per clock cycle, which suffices for implementing reasonably conventional architectures in a 500 MHz system.
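The clock-budget figure quoted above follows from dividing the clock period by the gate delay. The following is a minimal sketch of that arithmetic; the function name is illustrative, and clock skew, setup time, and interconnect delay are ignored.

```python
def gate_delays_per_cycle(clock_hz, gate_delay_s):
    """Number of sequential gate delays that fit in one clock period."""
    return (1.0 / clock_hz) / gate_delay_s


CLOCK_HZ = 500e6  # the 500 MHz system mentioned in the text

for delay_ps in (100, 200, 300):
    n = gate_delays_per_cycle(CLOCK_HZ, delay_ps * 1e-12)
    print(f"{delay_ps} ps gate delay: {n:.1f} gate delays per cycle")
```

A 2 ns period holds 20 gate delays at 100 ps and 10 at 200 ps, which is the quoted range; at 300 ps the budget falls to between 6 and 7 gate delays.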

As mentioned, some of the VLSI circuit families could possibly be considered in this logic class. However, their high output impedance would restrict the speed with which one could come off-chip.

This then leaves, for the present, the bipolar silicon ECL family and GaAs MESFET circuits as the most serious contenders for ultra-high speed logic. In each case, a mix of subfamilies provides the optimum performance and integration level.

Differential ECL, a variant of ECL, is faster but is a more complex circuit. Ordinary ECL combines speed and interconnect simplicity with a relatively high power dissipation. EFL is lowest in power, but does not perform the invert function and is not a good off-chip driver. As a consequence, these circuit types are frequently mixed on-chip for optimum performance. Where layout simplicity is preferred, as for instance in gate arrays, regular ECL offers the best compromise.

In comparison to bipolar ECL on silicon, GaAs MESFET technology suffers from a less developed processing technology, the use of a compound semiconductor resulting in a more difficult control of material uniformity, and a high concentration of surface states which makes the use of bipolar transistors and IGFETs for the present impractical.

GaAs MESFETs can be implemented in two circuit technologies. The faster circuit utilizes separate gate and driver/inverters and requires a level shift network (usually a string of Schottky diodes) to adjust input and output levels. This is required because MESFETs, like JFETs, operate at larger signal excursions only in the depletion mode. Variations of the basic circuit shown in Figure 3-8 have been developed more recently, giving a reasonably low power dissipation per gate as projected in Table 3-7.

The low power GaAs MESFET circuit, which is not projected for high speed applications, is based on the use of deep depletion mode MESFETs which are off at zero gate bias and operate in effect as enhancement mode transistors. Here pinch-off at zero bias is determined by the offset voltage of the Schottky diode, the channel doping level, and the limited depth of the channel.

GaAs MESFET transistors by themselves switch extremely fast, on the order of 10 picoseconds. Due to the relatively high output impedance, however, the RC time constant of an actual gate extends the gate switching time into the range of the best ECL performance. This high output impedance also represents a problem in driving off-chip interconnection lines.

As a consequence, the ECL circuit family offers at present the best performance in a typical high-speed computer or signal processor environment with its many devices, interconnections, and packaging levels. In applications where highest speed is required but logic complexity is limited, high speed GaAs MESFETs with level shifters offer the best performance. Such applications are IF and RF memories, counters, sequencers, and multiplex switches. Normally-off GaAs MESFETs offer neither advantage, but have an impressive integration potential and low power consumption at speeds which are not far off from the best ECL circuits. It is felt, however, that this low power MESFET technology requires further process advances and the development of a compatible low-impedance driver before its potential can be fully exploited.

GaAs MESFET transistors have also been utilized in conjunction with TED devices to perform high-speed logic functions in the gigahertz region. At present, such circuits seem less developed than pure MESFET circuits with level-shifting diodes. All major packaging considerations would in any case be very similar.

#### 3.2.1.5 Semiconductor Memory Technology

Early computer memories were made almost entirely with magnetic devices. These covered the range from high speed RAMs (cores) to high density mass storage (disc, tape, drum). High cost prevented the use of early solid-state devices for memory applications except for latches and registers, where high speed and direct interface with logic and control functions were required. However, as semiconductor technology advanced to the state of high integration, with a corresponding reduction in the cost per bit, solid-state RAM memories slowly replaced magnetic cores. Currently, semiconductor RAM memories span the range of access times from approximately 7 nanoseconds for very fast ECL types to several hundred nanoseconds for highly dense MOS dynamic types. An indication of this breadth of capability is shown in Figure 3-9.



Figure 3-9. Power vs Access Time

The semiconductor RAMs (and ROMs) currently being used in medium-to-large quantities are represented by the relatively small, fast bipolar and MOS group and the large scale integrated, medium speed bipolar and MOS group. The fast RAMs range in access times from 7 to 50 ns and cost from about 0.15 to 1.5 cents per bit. The slower large RAMs range in access times from 90 to 300 ns and cost from about 0.05 to 0.2 cents per bit.

Figure 3-9 indicates a relatively large gap between the VLSI RAMs and the slower, low cost magnetic discs with a relatively large storage capacity. In this gap it is proposed to introduce new 100K to 200K bit charge coupled devices and 1 Meg bit bubble memories. CCDs have access times spanning the range from several hundred microseconds to several milliseconds and are expected to cost in the range of ten millicents per bit, while bubble memories will have access times in the range of 10 milliseconds and cost below ten millicents per bit.

Another area not indicated in Figure 3-9 is a special class of devices known as EPROMs or EAROMs. These are field reprogrammable read only memories. They are finding applications in those areas where a variety of different mask programmable ROMs or fused PROMs are currently used, with the additional capability of being repeatedly reprogrammed. Erasure of the EPROM is accomplished with ultraviolet light, while for the EAROM it is accomplished electrically. They are still relatively slow as far as read access time is concerned (0.4 to 5 µs), but for applications requiring non-volatile, reprogrammable storage they are extremely useful. The EAROMs can serve as slow non-volatile RAMs. Many of these memories are being used as program memories for prototyping microprocessor systems.

Some of the characteristics of existing memory technologies and their future potential are listed in Table 3-8.

#### 3.2.1.5.1 MOS Memories

Table 3-8. Characteristics of Memory Technologies

| Memory Technology      | Year | Line Resolution (µm) | Number of Masks | Area/Bit (µm²) | Memory Size (bits) | Access Time (ns)                     | Active Power/Bit (mW) |
|------------------------|------|----------------------|-----------------|----------------|--------------------|--------------------------------------|-----------------------|
| CMOS/SOS               | 1979 | 2-4                  | 7               | 1000           | 16K                | 50                                   | 0.01                  |
|                        | 1982 | 1-2                  | 9               | 250            | 64K                | 25                                   | 0.005                 |
|                        | 1985 | 0.5-1                | 9               | 100            | 256K               | 10                                   | 0.002                 |
| NMOS dynamic           | 1979 | 2-4                  | 7               | 200            | 64K                | 100                                  | 0.005                 |
|                        | 1982 | 1-2                  | 9               | 20             | 256K               | 20                                   | 0.002                 |
|                        | 1985 | 0.5-1                | 9               | 25             | 1 Meg              | 25                                   | 0.5 µW                |
| I<sup>2</sup>L dynamic | 1979 | 2-4                  | 9               | 600            | 32K                | 150                                  | 0.02                  |
|                        | 1982 | 1-2                  | 9               | 200            | 128K               | 120                                  | 0.01                  |
|                        | 1985 | 0.5-1                | 2               | 100            | 512K               | 100                                  | 0.002                 |
| CCD                    | 1979 | 2-4                  | 8               | 250            | 128K               | Serial/block, 10-200 MHz data rate   | 100 nW                |
|                        | 1982 | 1-2                  | 8               | 100            | 512K               |                                      | 80 nW                 |
|                        | 1985 | 0.5-1                | 7               | 25             | 2 Meg              |                                      | 50 nW                 |
| ECL                    | 1979 | 2-4                  | 8               | 6000           | 1K                 | 7-30                                 | 0.5-1                 |
|                        | 1982 | 1-2                  | 8               | 3000           | 5K-8K              | 5-10                                 | 0.1-0.3               |
|                        | 1985 | 0.5-1                | 7               | 1000           | 16K-32K            | 1-5                                  | 0.05-0.1              |
| Bubble                 | 1979 | 2-4                  |                 | 200            | 100K-1M            | Serial/block, 300-1000 kHz data rate | 20 nW                 |
|                        | 1982 | 1-2                  |                 | 100            | 4M                 |                                      | 10 nW                 |
|                        | 1985 | 0.5-1                |                 | 20             | 16M                |                                      | 5 nW                  |

Figure 3-10. LSI Memory Growth

It is instructive to look at the growth of MOS dynamic RAM technology over the past few years and at what is expected in the near future. This can be done with the aid of Figures 3-10 and 3-11, which indicate the growth in bits per chip and the reduction in memory cell size, respectively. Figure 3-10 shows that memory size has essentially doubled every year since 1969, while the memory cell size has been reduced to only ≈300 µm². The cell reduction has been accomplished through both a reduction in the number of MOS transistors per bit and significant improvements in process technology. Early 1K and 4K bit MOS dynamic RAMs had a three-transistor cell, while the later 4K and 16K devices have only a one-transistor cell. Since cell size has not been reduced at the same rate that the memory size has increased, the physical size of the chips has had to increase. This was shown to be the general trend earlier. In some cases, manufacturers have maintained large device geometries and element counts in their cell design and drive circuits in order to improve access time. This has resulted in relatively large chip areas. Current 16K bit memory chips range in size from 40,000 mil² down to 18,000 mil², with corresponding access times ranging from 90 to 200 ns. In most cases the smaller chip size is preferred because of yield and, hence, cost considerations. Thus manufacturers are attempting to introduce novel circuit and process concepts in order to maintain small chip areas while attempting to increase speed.

The 64K chips that have recently become available all use one-transistor-per-cell designs and range in total chip area from 28,000 mil<sup>2</sup> to 45,000 mil<sup>2</sup>. These area differences probably reflect process variations, the largest chip being single layer polysilicon and the smallest double layer polysilicon for gate and storage electrodes.

A more recent concept that has been proposed for dynamic RAMs is the Charge Coupled RAM cell. This cell combines the storage gate and transfer gate of the conventional one-transistor cell into a single gate. CCD concepts have essentially been introduced through the alteration of the surface potential via ion implantation under a portion of the word line gate. This permits a single word line to be used for both storage and transfer, whereas conventional one-transistor cells require two lines, one for each function.

The growth curve of Figure 3-10 indicates that a 256K bit MOS RAM will possibly be announced in 1982. For this to become a reality, the cell size would have to be reduced dramatically in order to minimize the chip size. As an example, in the typical RAM memory chip the actual memory array occupies about 35 to 40% of the chip area. Using a 300 µm<sup>2</sup> memory cell, the resultant chip size would be in excess of 100,000 mil<sup>2</sup>. If tighter layout rules are used that would reduce the cell area to 100 µm<sup>2</sup>, then the chip would be ≈60,000 mil<sup>2</sup>. However, if a new concept is used, such as the Charge Coupled RAM cell, then cell sizes on the order of 50 to 100 µm<sup>2</sup> would be feasible. If this is coupled with a memory array area efficiency of 50%, then chips with total areas of between 30,000 and 40,000 mil<sup>2</sup> would be possible. The access time would range from 100 to 200 ns.

Currently a 64K bit CCD chip exists that occupies a chip area of 50,000 mil<sup>2</sup>. This chip was designed with relatively modest design rules (8 µm feature size), and the basic cell occupies less than 300 µm<sup>2</sup>. Of course, CCD memories that are block organized are much more memory efficient (50-60%).
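The array efficiency quoted for the CCD chip can be cross-checked from the figures in the paragraph above. The following is a minimal sketch that assumes 64K means 65,536 bits and uses 1 mil = 25.4 µm; since the cell is stated to occupy less than 300 µm², the result is an upper bound. The function names are illustrative.

```python
UM_PER_MIL = 25.4  # 1 mil = 0.001 inch = 25.4 micrometres


def um2_to_mil2(area_um2):
    """Convert an area from square micrometres to square mils."""
    return area_um2 / UM_PER_MIL ** 2


def array_efficiency(bits, cell_area_um2, chip_area_mil2):
    """Fraction of the chip area occupied by the storage cells."""
    return um2_to_mil2(bits * cell_area_um2) / chip_area_mil2


# 64K-bit CCD chip: cell area below 300 um^2 on a 50,000 mil^2 chip.
upper_bound = array_efficiency(64 * 1024, 300.0, 50_000.0)
print(f"array efficiency is at most {upper_bound:.0%}")
```

The bound comes out at about 61%, which is consistent with the 50-60% efficiency quoted for block organized CCD memories.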

#### 3.2.1.5.2 Bipolar Memories

Currently, bipolar static RAMs provide the fastest access times. These devices are designed with emitter coupled logic (ECL) and range in size from 64 to 1,024 bits. They range in access times from 7 to 30 ns and are designed with ECL interface circuits that allow them to be controlled directly by subnanosecond ECL central processing units. The complex processing of ECL, coupled with rather large cell sizes, results in costs approaching 2 cents per bit. Since these memories are designed for speed, they dissipate on the order of a milliwatt of power per bit. Thus, a 1K bit ECL memory dissipates in the range of 1 watt.

The TTL static memories are next in line, with lower speed and power and a larger integration potential. They are also less costly and are faster than comparable MOS devices, and thus find many applications in mid-range buffer systems.

One of the most powerful bipolar technologies that has been recently introduced is integrated injection logic (I<sup>2</sup>L). It is providing a means for bipolar technology to be used in the memory areas that were usually reserved for MOS designs. The cell size of an I<sup>2</sup>L bit is currently only ≈650 µm<sup>2</sup>. The speed of an I<sup>2</sup>L memory is twice that of a comparable MOS device, and power dissipation per bit is lower. It is expected that for the next few years the bit density will be only half that of MOS types. However, the I<sup>2</sup>L dynamic memory does have a low cost potential and is going to compete intensely with MOS.

#### References

- (1) R. H. Dennard, et al., "Design of Ion-Implanted MOSFET's with Very Small Physical Dimensions," IEEE J. of Solid-State Circuits, Vol. SC-9, No. 5, October 1974.
- (2) R. M. Klaassen, "Design and Performance of Micron-Size Devices," Solid-State Electronics, Vol. 21, No. 3-E, 1978.
- (3) R. Pashley, et al., "H-MOS Scales Traditional Devices to Higher Performance Level," Electronics, August 18, 1977.

#### 3.2.2 Components for a Processor Satellite

#### 3.2.2.1 Introduction

The previous sections have considered the expected increases in device technology performance during the 1980's. In this section we will discuss some of the specific components that will be available for use by 1985. The required integration levels are moderately ambitious in that integration levels are being proposed for logic and memory that currently do not exist. Therefore, it is important that the selected technologies be somewhat mature, having demonstrated a significant VLSI potential, combined with very low functional element power dissipations at moderate speed.

Presently, for logic devices having gate propagation delays in the range of 1 to 5 nanoseconds, integration levels of 1000 to 5000 gates per chip have been demonstrated. For the present application this integration level will be extended to a maximum of 20,000 gates. In the case of memories, requirements will exist for RAM, ROM, and serial structures, where integration levels will be directly related to the specific structure, its technology, and whether dynamic or static operation is chosen. Currently, bit densities have been realized in various configurations ranging from 4K-bits to 131K-bits. The smaller density would correspond to a static RAM, while the largest density is to be achieved for a serial CCD structure. Depending on the organizational complexity of these memories, the maximum achievable bit densities will vary. For example, if it is desired to perform combinations of memory and logic functions in CCDs, the maximum bit density may be limited to 200K-bits. In some cases it will be required to combine either RAM or ROM on the same chip with complex logic functions. Here, appropriate sizing will be considered in order to obtain optimum densities for both functional structures. As an example, if a chip is to be configured with static RAM and computational logic, a possible mix is 5,000 to 10,000 gates of logic with a 10K-bit to 16K-bit static RAM. Using dynamic RAM, the memory size could be extended to range from 32K-bits to 64K-bits. In all cases I/O will be limited to 100 pads.

In the following sections the anticipated critical chip designs will be discussed, along with their recommended technologies.

#### 3.2.2.2 Proposed Chip Designs and Related Technologies

There are seven to eight basic designs being considered for this processor implementation. These designs will require varying amounts of logic and memory, and at least three device technologies. For these designs the device technologies being considered have been demonstrated in varying levels of functional complexity. Except for the A/D converter, which is being proposed using either ECL or GaAs MESFET technology, all the chip technologies are silicon MOS based. These MOS based technologies encompass CMOS/SOS, NMOS/SOS, NMOS/DIS (dielectrically-isolated-silicon), and CCD. In all cases a 2 µm linewidth resolution is required. For purposes of illustration, three of these chip designs are discussed below.

##### A) RAM Memory Chip

This chip can be configured as either a 32K-bit structure in CMOS/SOS static memory, organized 32K by 1, or a 128K-bit NMOS dynamic structure, organized 64K by 2. Either memory chip would be designed to have a read/write cycle time under 100 nanoseconds. The active power dissipation for the 128K-bit NMOS dynamic chip would be 500 mW, with a standby power of 100 mW. The active power for the static CMOS/SOS chip would be 160 mW. However, its standby power would be approximately 10 µW.

Since average power is a prime factor in the selection of any of the proposed chip designs, it is essential to combine operating power and standby power into an overall average. If these memory chips are operating at a 10% duty cycle, then the CMOS/SOS structure is, by far, the best choice. The overall power per bit would be reduced by a factor of 10 (from 5 µW/bit to 0.5 µW/bit). The number of individual chips in CMOS/SOS would be greater by a factor of 4 than in NMOS dynamic. However, the I/O requirements are not excessive, and could be easily accommodated using an efficient packaging concept, as will be discussed later. Therefore, it is recommended that this memory chip be configured using CMOS/SOS.
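The duty-cycle averaging described above can be written out explicitly. The following is a minimal sketch that weights the quoted active and standby powers by a 10% duty cycle and divides by the chip capacity. It assumes that "K" means 1024 bits; the function names are illustrative.

```python
BITS_PER_K = 1024  # assumption: "K" in the text means 1024 bits


def average_power_mw(active_mw, standby_mw, duty_cycle):
    """Time-weighted average of active and standby power."""
    return duty_cycle * active_mw + (1.0 - duty_cycle) * standby_mw


def microwatts_per_bit(power_mw, kbits):
    """Convert a chip power in mW to microwatts per stored bit."""
    return power_mw * 1000.0 / (kbits * BITS_PER_K)


# (active mW, standby mW, size in K-bits) as quoted for the two RAM options.
chips = {
    "CMOS/SOS static, 32K": (160.0, 0.010, 32),
    "NMOS dynamic, 128K": (500.0, 100.0, 128),
}

for name, (active, standby, kbits) in chips.items():
    avg = average_power_mw(active, standby, duty_cycle=0.10)
    print(f"{name}: active {microwatts_per_bit(active, kbits):.2f} uW/bit, "
          f"10% duty average {microwatts_per_bit(avg, kbits):.2f} uW/bit")
```

For the CMOS/SOS chip this reproduces the drop from roughly 5 µW/bit to roughly 0.5 µW/bit, because its standby power is negligible. The NMOS dynamic average remains dominated by its 100 mW standby power.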

##### B) Core Processor Chip

This chip requires a combination of logic and memory. The logic gate complexity is approximately 5000 gates, consisting primarily of 2 multipliers and an ALU. The register memory density would be 5K-bits, organized 512 by 9. As noted earlier, 16K of control RAM is also included on this chip. Configured in CMOS/SOS, this chip would dissipate approximately 150 mW active power. The standby power would be in the range of 10 µW. Configured in NMOS, this chip would dissipate more than 2.5 watts active power. Therefore, CMOS/SOS is the preferred choice. A cycle time of 50 to 100 ns is proposed for this chip.


##### C) A/D Converter Chip

Two proposed organizations using either ECL or GaAs MESFET technology have been considered. These consist of a successive approximation organization and a feed-forward organization. For the near term, the successive approximation type would appear to be the most feasible. It would provide an 8-bit conversion at a sample rate of up to 50 megasamples/sec with a total power dissipation of 150 mW. A preliminary design using GaAs logic resulted in only 800 GaAs MESFETs on a chip 60 x 80 mils. The producibility of this device in a 5-year development effort is considered to be good.
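The report does not describe the internal operation of the converter. As a generic illustration of the successive approximation principle, the following minimal sketch resolves one bit per comparison, most significant bit first, and estimates the time available per bit decision at the quoted sample rate. It assumes an ideal comparator and DAC and ignores sample-and-hold and settling overhead; all names are illustrative.

```python
def sar_convert(v_in, v_ref, n_bits=8):
    """Successive approximation: a binary search with one comparison per bit."""
    code = 0
    for bit in reversed(range(n_bits)):  # most significant bit first
        trial = code | (1 << bit)
        # Keep the bit if the trial DAC level does not exceed the input.
        if trial * v_ref / (1 << n_bits) <= v_in:
            code = trial
    return code


def bit_decision_time_ns(sample_rate_hz, n_bits=8):
    """Time per bit decision if the whole sample period is split evenly."""
    return 1e9 / (sample_rate_hz * n_bits)


print(sar_convert(0.5, 1.0))                # mid-scale input
print(bit_decision_time_ns(50e6), "ns per bit")
```

Under these assumptions, an 8-bit conversion at 50 megasamples/sec leaves 2.5 ns per bit decision, which suggests why logic families with subnanosecond gate delays are proposed for this chip.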

#### 3.2.2.3 Conclusions and Packaging Considerations

The devices discussed above indicate the complexity and performance of the basic building elements for this processor. It is anticipated that this effort will involve a low level of risk. The anticipated benefits are in achieving a level of device integration with a significant reduction in power consumption. In order to achieve the element densities proposed in these designs, a moderate level of technology improvement is going to be required. A number of advanced concepts have already been proposed and demonstrated by many R and D laboratories for both device processing and linewidth definition. It is expected that by 1985 these laboratory concepts will be a production reality.

Concurrent with these device development efforts a suitable packaging technology must be developed. A VLSI hybrid packaging concept using sapphire substrates for chip interconnect is a viable candidate for this application. It provides for the necessary low interconnect drive power that is going to be required in order to adequately interface these VLSI devices. On-chip driver power will be reduced by an order of magnitude over conventional packaging techniques.


# APPENDIX A BIT RIPPLER DISCUSSION

One of the major obstacles in realizing the potential of subelement redundancy is the need for a switch that can configure the various subelements into an operational system without simultaneously dissipating much of the promised gain in reliability. A switching device, called a "rippler switch," was designed to overcome this obstacle. Basically, the rippler functions by replacing any defective subelement in a linear array of identical devices by its nearest neighbor. That device is then replaced by its nearest neighbor, etc., the whole process "rippling" down the array of subelements until the last active device is replaced by the first available spare. The advantage of this switching method over the more conventional direct substitution approach is in its amenability to relatively simple (and hence reliably implemented) control algorithms.

Figure A-1 demonstrates how the rippler accomplishes the desired switching for an element which is partitioned into five identical subelements with three spare subelements. Each subelement has associated with it a rippler slice which can assume any one of five states. The rippler states establish the data paths (indicated by the lines in Figure A-1) which provide the link between the operating subelements and a set of input/output ports (represented by the numbers 1 through 5), thereby defining the functional role of each subelement. If the rippler slice associated with subelement i is in state $R_0$, then subelement i is connected to function i; if in state $R_1$, then subelement i is connected to function i-1. Similarly, in states $R_2$ and $R_3$, subelement i is connected to functions i-2 and i-3, respectively. Finally, in state $R_4$, subelement i is not connected to any function. Initially, each rippler slice is in state $R_0$, and the data paths are as shown in Figure A-1(a).

Figure A-1. Typical Rippler Sequence

Now, suppose subelement $S_4$ fails. The fourth rippler slice is set to $R_4$, and all subsequent rippler slices are advanced one state to $R_1$. As shown in Figure A-1(b), the defective subelement is rippled out, and the first spare subelement is used. Similarly, Figure A-1(c) shows the response to a second failure, this one in subelement $S_2$. The second rippler slice is set to state $R_4$, the third rippler slice advances one state to $R_1$, the fourth slice remains in state $R_4$, and the subsequent slices are advanced one state to $R_2$. As a result, both failed subelements are rippled out and the first two spare subelements are rippled in. It is readily seen that this control algorithm will always result in an operational system, regardless of the order in which the failures occur, so long as the number of failures does not exceed the number of available spares.
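The control rule described above can be sketched in a few lines. The following is a minimal illustration and not the study's implementation: each working slice takes a state equal to the number of failed subelements to its left, and each failed slice takes the disconnected state ($R_4$ when there are three spares). The function names and the use of a set of failed subelement numbers are assumptions made for the example.

```python
def rippler_states(n_functions, n_spares, failed):
    """Return the state index of each rippler slice, in subelement order.

    A state k up to n_spares means "serve function i - k"; the last state
    (R4 when there are three spares) means "disconnected".
    """
    if len(failed) > n_spares:
        raise ValueError("more failures than spares")
    disconnected = n_spares + 1
    states = []
    shift = 0  # failed subelements seen so far to the left of this slice
    for i in range(1, n_functions + n_spares + 1):
        if i in failed:
            states.append(disconnected)
            shift += 1
        else:
            states.append(shift)
    return states


def function_assignment(n_functions, n_spares, failed):
    """Map each function number to the subelement that serves it."""
    disconnected = n_spares + 1
    assignment = {}
    states = rippler_states(n_functions, n_spares, failed)
    for i, state in enumerate(states, start=1):
        function = i - state
        if state != disconnected and 1 <= function <= n_functions:
            assignment[function] = i
    return assignment


print(rippler_states(5, 3, {4}))          # first failure, in S4
print(function_assignment(5, 3, {2, 4}))  # second failure, in S2
```

Because each state depends only on which subelements have failed, and not on the order of failure, the same assignment results whichever failure occurs first.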

Subelement redundancy is employed for the memory bit lines in each processor requiring a 64K word memory. This memory is partitioned into 25 bit lines with 3 spare bit lines.
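The benefit of three spare bit lines can be illustrated with a simple binomial model. The following is a minimal sketch under assumptions that are not taken from the study: bit lines fail independently, each with the same probability `p_fail` over the mission, and the rippler switch itself never fails. The value of `p_fail` is purely hypothetical.

```python
from math import comb


def survival_probability(n_active, n_spares, p_fail):
    """Probability that no more than n_spares of the lines have failed."""
    n_total = n_active + n_spares
    return sum(
        comb(n_total, k) * p_fail ** k * (1.0 - p_fail) ** (n_total - k)
        for k in range(n_spares + 1)
    )


P_FAIL = 0.02  # hypothetical per-line failure probability, for illustration only

print(f"25 lines, no spares: {survival_probability(25, 0, P_FAIL):.4f}")
print(f"25 lines, 3 spares:  {survival_probability(25, 3, P_FAIL):.4f}")
```

With these illustrative numbers the memory survives with a probability of about 0.60 without spares, and above 0.99 with the three spares, before any allowance is made for the reliability of the switch.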